Reference Assessment Programs: Evaluating Current and Future Reference Services Using Surveys
2003 Toronto RUSA Preconference
SARS = Satisfaction And Reference Service
Dr. John V. Richardson Jr., Professor of Information Studies, UCLA Department of Information Studies
Presentation Outline
Why Survey Our Users?
Question Design and Validity Concerns: intent of the question, clarity of the question, unidimensionality, scaling, number of questions
Method Issues: timing of administration, question order, sample sizes
Mini Case Studies
Recommended Readings
Why Survey Our Users?
We need to know what we don't know: satisfaction and dissatisfaction, loyalty and the Internet, user needs and expectations. Without that knowledge, we cannot design effective new programs or identify best practices.
Mission: The library's mission is to provide outstanding services.
Goal: In order to achieve our mission, the goal must be to understand our users' attitudes toward the library and its services, especially reference.
Objective: My objective is to help you undertake a well-designed attitudinal study of satisfaction/dissatisfaction, loyalty, and intent to return.
Question Design and Validity Concerns
Eight issues must be addressed to ensure the validity of survey results:
Intent of the question
Clarity of the question
Unidimensionality
Scaling
Number of questions to include
Timing of administration
Question order
Sample sizes
A well-designed study takes all eight points into consideration.
1. Intent of the Question
RUSA Behavioral Guidelines (1996): approachability, interest in the query, and active listening skills.
UniFocus (300 factor analyses in the hospitality industry): friendliness, helpfulness or accuracy, promptness of service.
What does your library care about? What do you want to know about? Customer satisfaction versus customer loyalty?
2. Clarity of the Question Data from unclear questions may be invalid Use instructions to enhance question clarity Editing, rewriting, and pre-testing are invaluable
Mini Case Study What is the literal correct answer to the question posed? Yes or No is the correct answer rather than a scaled response… Could it be rewritten? “How are we providing…”?
3. Unidimensionality Unidimensionality is a statistical concept that describes the extent to which a set of questions all measure the same topic How many questions do you need? Do they all measure the same thing?
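A common statistical check on unidimensionality is Cronbach's alpha, which estimates how consistently a set of items measures a single construct (values near 1 suggest the items hang together). This example is my illustration, not part of the original slides; a minimal sketch in plain Python:

```python
def cronbach_alpha(items):
    """Internal-consistency estimate for a set of survey items.

    items: one list per question, each containing every respondent's score.
    """
    k = len(items)                       # number of questions
    n = len(items[0])                    # number of respondents

    def var(xs):                         # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    sum_item_vars = sum(var(col) for col in items)
    # Each respondent's total across all questions
    totals = [sum(col[i] for col in items) for i in range(n)]
    return (k / (k - 1)) * (1 - sum_item_vars / var(totals))
```

As a rough convention, alpha above .7 is often treated as acceptable internal consistency, though any cutoff is a matter of judgment.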
Constellation of Attitudes
Satisfaction, delight, intent to return, feelings about experiences, value, loyalty.
"Satisfaction researchers in marketing are in general agreement that the emotion of delight is comprised of joy and surprise. This study reviews the relevant emotions literature in psychology, the neurosciences and philosophy to show that there may be two different kinds of delight - one with surprise and one without surprise. The work of Plutchik (1980) is often cited as the basis for conceptualizing delight as being comprised of joy and surprise. We replicated Plutchik's two studies using more positive complex emotion terms than the original study. It was found that subjects could feel delighted without being surprised and that there were different emotion terms that were considered by subjects to be comprised of joy and surprise. These results were validated in a second study in which consumer emotions and other responses were captured in a live setting during the intermission of an upbeat, fast tempo Irish Dance concert. The results show that consumers could be delighted even when they were not surprised. We show how these findings clarify and explain some unexpected results obtained in past research on customer delight. The implications of these findings for both theory and practice are also discussed."
RUSA Behavioral Guidelines
Approachability, interest in the query, active listening skills.
Perhaps the most studied and tested standard of any; one might design survey questions along these lines. It carries the imprimatur of a professional organization.
4. Scaling Three key characteristics: Does the scale have the right number of points (called response options)? Are the words used to describe the scale points appropriate? Is there a midpoint or neutral point on the scale? Take a look at your examples…
A. Response Options
A common four-point scale: very good, good, fair, and poor. The distance between "very good" and "good" is not the same as the distance between "fair" and "poor," so the numeric values associated with these options (4, 3, 2, and 1) may lead to invalid results. A four-point scale is familiar, but…
Mini Case Study What is the distance between each of these response options? Technically, the question is framed to be answered as Yes or No.
B1. Scale Anchors (each pair anchored with "Very…" at both ends)
Satisfied / Dissatisfied
Much Agree / Much Disagree
Positive / Negative
Valuable / Costly
Enjoyable / Unpleasant
Friendly / Unfriendly
Here are some good examples of scale anchors which are balanced…
Mini Case Study
What are the scale anchors here? It is not 1; this is a three-point scale in which the value 1 stands for N/A, and 4 and 2 are the anchors ("doing well" and "needs improvement"). Are they parallel? "Not okay" is…; "better than okay" is…. What if this question is skipped?
B2. Seven-Point Scales
Scale A: Very Good 7 6 5 4 3 2 1 Very Poor, N/A 0
Scale B: Excellent 7 6 5 4 3 2 1 Very Poor, N/A 0
Scale C: Outstanding 7 6 5 4 3 2 1 Disappointing, N/A 0
Seven-point scales are better because they guard against range restriction. Is there any reason to believe that scores will cluster together? If so, use seven points, but look closely at the anchors on this slide.
C. Wording of Options
The only difference among the scales on the preceding slide is the response anchors. Is "very good" a rigorous enough expectation? Would "excellent" be better? What about "outstanding"? What are your library's aspirations?
Mini Case Study
How many response points are there? What is the level of expectation? It's not a trick question; there are two correct answers, but the better answer may be 8, because 0 would be the missing value if the question is skipped. Is "very satisfied" a rigorous enough expectation? Would "delighted" be better?
D. Midpoint or Neutral Point
The rate of skipped questions increases when a neutral response is not included, so use an odd number of response points. A neutral response also provides a way to treat missing data: if respondents skip a question, one can assume they are neutral on it.
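The treatment of missing data just described, counting a skipped question as the scale's midpoint, can be sketched as follows. This is a hypothetical helper of my own, assuming a seven-point scale whose neutral midpoint is 4:

```python
def mean_with_neutral_imputation(responses, midpoint=4):
    """Mean score on a scaled item, treating skipped questions (None)
    as the neutral midpoint before averaging."""
    filled = [midpoint if r is None else r for r in responses]
    return sum(filled) / len(filled)
```

For example, on a seven-point item, the responses 7, 6, skipped, 5 would be averaged as 7, 6, 4, 5.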
Mini Case Study What’s the midpoint?
5. Number of Questions
How many questions should we ask our users? Short enough that users will answer all the questions; long enough that sufficient information is gathered for decision-making purposes. "Short" surveys take less than ten minutes.
A. Longer Surveys
Longer surveys take more time and effort on the part of the respondent. A high perceived "cost of completion" results in partially or completely unanswered surveys. "Longer" means up to one hour.
B. Likelihood of Complete Responses
The higher the salience (the more important the topic is to the user), the greater the likelihood that they will complete a longer survey. Multiple questions measuring a single attitude make for longer surveys, although they also aid in evaluating user attitudes. Remember: what's interesting to us may not be interesting to them. Consider the role of premiums and rewards.
6. Timing and Ease
Administer during or immediately following the transaction; otherwise, will experiences blur together? Options include cards, the mail method, and IVR (interactive voice response). Delay seems to produce more positive results. Electronic reference allows for ease of administration (more on PaSS™ later). Weigh immediacy against longer-term utility. In an academic library, the end of the quarter or the end of the year might not be the best time.
7. Question Order
Specific questions first: technology, resources, or staffing.
More general questions second: value, overall satisfaction, intent to return.
Beware the halo effect. In a four-question survey with one overall and three specific questions, asking the general question last produces better data. What difference does it make?
Mini Case Study Specific questions first. More general second.
8. Sample Sizes
The required sample size depends upon population size, error rate, and confidence. Consult a table of sample sizes. Please, please work on getting larger sample sizes; small samples have no credibility when read outside of our field.
A. Error Rate
Defined as the precision of measurement: accurate to plus or minus some figure. It has to be precise enough to know which direction service quality is going (i.e., up or down).
B. Confidence
Refers to the overall confidence in the results. A .99 confidence level means that one can be relatively certain that the results are within the stated range 99% of the time. A .95 confidence level is common. A .90 confidence level is less common; it requires fewer respondents, but yields less certainty in the results.
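The trade-off between confidence level and sample size can be illustrated with the standard margin-of-error formula for a proportion. This sketch is my addition, not from the slides; z = 1.96 corresponds to a .95 confidence level and z = 1.645 to .90, with p = 0.5 as the most conservative assumption:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate half-width of the confidence interval for a proportion,
    given sample size n, assumed proportion p, and critical value z."""
    return z * math.sqrt(p * (1 - p) / n)
```

For a margin of plus or minus 5 percentage points, roughly 384 respondents suffice at the .95 confidence level, but only about 271 at .90, which is why the lower confidence level requires fewer respondents.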
C. Population and Sample
Population (N) refers to the people of interest. Sample (n) refers to the people measured to represent the population. Response rate is the proportion of those surveyed who actually respond.
D. Population & Sample Size
N        n
100      80
200      132
500      217
1,000    278
10,000   370
20,000   377
SOURCE: Robert V. Krejcie and Daryle W. Morgan, "Determining Sample Size for Research Activities," Educational and Psychological Measurement 30 (Autumn 1970): 607-610.
Note that the required sample size does not increase at the same rate as the population; several hundred should be your goal in most library settings. The larger the population (say, 10,000 graduate students versus 20,000 undergraduates), the smaller the additional sample needed.
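The table above follows the Krejcie and Morgan (1970) formula, which can be sketched as follows (using the chi-square value 3.841 for one degree of freedom at the .95 confidence level, population proportion P = 0.5 for the maximum sample size, and degree of accuracy d = .05):

```python
def krejcie_morgan(N, chi2=3.841, P=0.5, d=0.05):
    """Required sample size s for a population of size N
    (Krejcie & Morgan, 1970)."""
    s = (chi2 * N * P * (1 - P)) / (d ** 2 * (N - 1) + chi2 * P * (1 - P))
    return round(s)
```

Plugging in the table's population sizes reproduces its sample sizes, e.g. 80 for a population of 100 and 377 for a population of 20,000.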
Appropriate Sample Sizes
Case Studies Much of the extant surveying of reference service is inadequate, misleading, and can result in poor decision-making Improving user service means understanding what leads to satisfied and loyal users Patron Satisfaction Survey (PaSS)™ http://www.vrtoolkit.net/PaSS.html
Recommended Bibliographies 1,000 citations to reference studies at http://purl.org/net/reference 300 citations to virtual reference studies at http://purl.org/net/vqa
Best Single Overview Richardson, “The Current State of Research on Reference Transactions,” In Advances in Librarianship, vol. 26, pages 175-230, edited by Frederick C. Lynden. New York: Academic Press, 2002.
Recommended Readings Saxton and Richardson, Understanding Reference Transactions (2002) Most complete list of dependent and independent variables used in the study of reference service McClure et al., Statistics, Measures and Quality Standards (2002) Most complete list of measures for virtual reference work Richardson, Virtual Question Answering: Applications, Problems, Progress (forthcoming).
Questions and Answers What do you want to know now? If you want a more private response or to work with me as a consultant, I am available: jrichard@ucla.edu or jvrjvr@attglobal.net