Surveys and Attitude Measurement
Surveys seem to be everywhere because they are tremendously flexible: you can ask people about anything, and most people will give you answers on just about anything. They aren’t perfect measuring instruments, of course, because people can shade what they choose to tell you. Survey methodologists therefore spend a great deal of time and angst constructing surveys that are as free of problems as possible and building in checks where they can. Survey methodology is too extensive a topic to cover comprehensively here, and it is more appropriate for Marketing Research texts (such as that by Iacobucci and Churchill!), so we will assume some basic familiarity with survey design and sampling as we proceed. For this book on models, let us pick up on how the goodness of surveys is tested. No single item is perfect, but the hope is that an aggregate of several items will be more solid and provide a reasonably precise and accurate measure of customer sentiment (random errors on one item should tend to cancel out those on another, etc.). Specifically, measurement theory says that a person’s answer to a question, X, is their true opinion, T, plus some error, ε, where the error is random, centered on zero, and roughly normally distributed (so the positives cancel the negatives): X = T + ε.
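To make the claim about aggregation concrete, here is a minimal simulation sketch of X = T + ε. Everything in it is an illustrative assumption rather than something from the text: the function name `simulate_errors`, the sample size, the five items, and the choice of standard normal true scores and errors. It compares the error variance of a single item with that of the mean of several items.

```python
import random
import statistics


def simulate_errors(n_people=2000, n_items=5, error_sd=1.0, seed=0):
    """Simulate X = T + e for each item and return the error variance of
    (a) a single item and (b) the mean of all items, both relative to T."""
    rng = random.Random(seed)
    single_item_errors = []
    scale_errors = []
    for _ in range(n_people):
        true_score = rng.gauss(0.0, 1.0)  # T: the person's true opinion
        # Each answer is the true score plus an independent random error.
        answers = [true_score + rng.gauss(0.0, error_sd) for _ in range(n_items)]
        single_item_errors.append(answers[0] - true_score)
        scale_errors.append(statistics.mean(answers) - true_score)
    return (statistics.pvariance(single_item_errors),
            statistics.pvariance(scale_errors))


single_var, scale_var = simulate_errors()
print(f"error variance, one item:    {single_var:.3f}")
print(f"error variance, 5-item mean: {scale_var:.3f}")
```

Under these assumptions (independent errors with equal variance), the error variance of the 5-item mean should come out near one fifth of the single-item error variance, which is the sense in which random errors on different items cancel.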
Reliability
Psychometricians speak of a scale’s reliability and validity. Reliability may be assessed in several ways. “Test-retest” reliability is a measure of the consistency of responses on the same survey from a sample at two points in time, and “alternate forms” reliability measures the consistency of responses from a sample on two surveys that are similar but not identical. These tests of reliability are rarely conducted in the real world because marketers feel lucky enough when they get customers to answer a survey once; getting them to do so again is usually unrealistic. More frequently, reliability is conceptualized and measured as “internal consistency.” That is, if part of a survey presents 5 questions asking customers about their service provider, then those 5 items should be fairly highly inter-correlated. If another section of the survey has 4 questions that ask customers about their perceptions of the fairness of the brand’s prices, those items should be correlated among themselves as well. (Whether the 5 questions about the service provider and the 4 questions about pricing are correlated with each other is a completely different question that we’ll address later.)
The extent to which a set of items hangs together is captured in two different indices. First, we can compute “item-total correlations.” As the name suggests, we would compute the correlation between question 1 and the average of questions 2, 3, 4, and 5. (We temporarily leave question 1 out of the total; otherwise the correlation would be spuriously inflated, because the “total” would include the item itself.) We then proceed to compute the correlation between item 2 and the average of items 1, 3, 4, and 5, and so on. An item with a low item-total correlation is a candidate for removal before subsequent modeling: it would seem that the item doesn’t measure what the others are capturing, so including it would add noise.
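The item-total procedure just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed implementation: the helper names `pearson` and `item_total_correlations` are hypothetical, and the `responses` data are made up (six respondents, five items on a 1 to 5 scale, with item 5 deliberately unrelated to the rest).

```python
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_total_correlations(rows):
    """For each item, correlate it with the mean of the *other* items,
    so the item never contributes to its own 'total'."""
    n_items = len(rows[0])
    result = []
    for i in range(n_items):
        item = [row[i] for row in rows]
        rest = [statistics.mean(row[:i] + row[i + 1:]) for row in rows]
        result.append(pearson(item, rest))
    return result


# Hypothetical data: one row per respondent, one column per item.
responses = [
    [1, 2, 1, 2, 4],
    [2, 2, 3, 2, 1],
    [3, 3, 3, 4, 5],
    [4, 4, 5, 4, 2],
    [5, 5, 4, 5, 3],
    [5, 4, 5, 5, 3],
]

item_total = item_total_correlations(responses)
for number, r in enumerate(item_total, start=1):
    print(f"item {number}: item-total r = {r:.2f}")
```

With these made-up data, items 1 through 4 correlate strongly with the average of the remaining items, while item 5 has an item-total correlation near zero, which would flag it as a candidate for removal.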
A second index of internal consistency is called Coefficient Alpha, defined as

$$\alpha = \frac{p}{p-1}\left(1 - \frac{\sum_{i=1}^{p}\sigma_i^2}{\sigma_x^2}\right),$$

where $p$ is the number of items in the scale (5 for our example), $\sigma_i^2$ is the variance of item $i$, and $\sigma_x^2$ is the variance of the entire scale, that is, of the sum of the items. Thus

$$\sigma_x^2 = \sum_{i=1}^{p}\sigma_i^2 + \sum_{i \neq j}\sigma_{ij},$$

a sum over the variances of all $p$ items and all of their covariances. Remember that a covariance, $\sigma_{ij}$, is like a correlation coefficient in that it measures the extent of a linear relationship between items $i$ and $j$, but the correlation coefficient is scaled by the two items’ standard deviations, $r_{ij} = \sigma_{ij}/(\sigma_i \sigma_j)$, so that $\sigma_{ij} = r_{ij}\,\sigma_i \sigma_j$. That scaling results in the nice property that $-1.0 \le r_{ij} \le +1.0$. Computing $\sigma_x^2$ is thus like summing over the whole $p \times p$ correlation matrix, but it is actually the sum over the whole covariance matrix. Coefficient Alpha is closest to 1.0 (high reliability) if the items all hang together: then the sum of the $\sigma_i^2$’s (in the numerator) will be small compared with $\sigma_x^2$ in the denominator ($\sigma_x^2$ sums those individual item variances but also all the covariances, which, like correlations, are large when the items hang together, so it should be larger). If that ratio is small, then 1 minus the ratio (the quantity in parentheses) will be large, and a large Coefficient Alpha can be achieved. For more information, go online or see the Coefficient Alpha references below. It is good practice to have multi-item scales on surveys, but in the real world, replicate items often get lopped off in the interest of keeping surveys short.
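As a rough sketch of the computation, Coefficient Alpha can be obtained as p/(p − 1) times one minus the ratio of the summed item variances to the variance of the summed scale. The function name `coefficient_alpha` and the `responses` data below are illustrative assumptions (six hypothetical respondents, five items, with item 5 constructed to be unrelated to the others).

```python
import statistics


def coefficient_alpha(rows):
    """Coefficient Alpha for a list of respondent rows (one score per item).

    alpha = p / (p - 1) * (1 - sum of item variances / variance of the sum score)
    """
    p = len(rows[0])
    item_variances = [statistics.variance([row[i] for row in rows])
                      for i in range(p)]
    # The variance of the summed scale contains every item variance
    # plus every covariance between pairs of items.
    scale_variance = statistics.variance([sum(row) for row in rows])
    return (p / (p - 1)) * (1 - sum(item_variances) / scale_variance)


# Hypothetical data: one row per respondent, one column per item.
responses = [
    [1, 2, 1, 2, 4],
    [2, 2, 3, 2, 1],
    [3, 3, 3, 4, 5],
    [4, 4, 5, 4, 2],
    [5, 5, 4, 5, 3],
    [5, 4, 5, 5, 3],
]

alpha_all = coefficient_alpha(responses)
alpha_first_four = coefficient_alpha([row[:4] for row in responses])
print(f"alpha, all five items:  {alpha_all:.2f}")
print(f"alpha, items 1-4 only:  {alpha_first_four:.2f}")
```

For these made-up data, alpha is about 0.83 with all five items and about 0.96 once the unrelated fifth item is dropped, which illustrates how an item that does not hang together with the others pulls the index down.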
Validity
Content
Construct
Criterion-Related
Coefficient Alpha References
Duhachek, Adam, Anne T. Coughlan, and Dawn Iacobucci (2005), “Results on the Standard Error of the Coefficient Alpha Index of Reliability,” Marketing Science, 24 (2).
Duhachek, Adam and Dawn Iacobucci (2004), “Alpha’s Standard Error (ASE): An Accurate and Precise Confidence Interval Estimate,” Journal of Applied Psychology, 89 (5).
Iacobucci, Dawn and Adam Duhachek (2003), “Advancing Alpha: Measuring Reliability with Confidence,” Journal of Consumer Psychology, 13 (4).