Sarah E. Stegall, Darrin L. Rogers, Emanuel Cervantes
The Semantic Inconsistency Scale (SIS), a no-cost tool for measuring random responding in questionnaire research, was developed and validated in two independent samples. Initial evidence supports its validity: the SIS detects not only computer-generated random responses but also invalid responding produced under more realistic conditions.
Invalid responding can threaten validity and interpretation (Huang et al., 2012; however, see Costa & McCrae, 1997 for alternative views). Random responding (RR; Archer & Smith, 2008) scales measure participants' consistency of responses to pairs of items with similar or opposite meanings (e.g., MMPI-2 scale VRIN; Butcher et al., 2001; PAI scale INC; Morey, 2007).
Methods previously used to develop and evaluate RR scales include:
- Comparing responses from participants instructed to answer questionnaires randomly with those of participants given standard instructions (Berry et al., 1991; Cramer, 1995; Gallen & Berry, 1996)
- Comparing real responses with computer-generated random responses (Charter & Lopez, 2003)
More ecologically valid, real-world manipulations, which are more externally valid, have not been used so far.
Existing RR scales are available only in commercially marketed assessments. The Semantic Inconsistency Scale (SIS) is a public-domain measure of RR for use with questionnaires.
Participants: 482 undergraduate students; 75% female, 25% male; 95% Hispanic.
Data collection phases:
Phase 1 (February-July 2012): N = 286, 75% female.
Phase 2 (August-December 2012): N = 196, 81% female.
Procedures & Materials: Anonymous online survey Big Five Inventory (BFI; John & Srivastava, 1999) SIS item pool I see myself as someone who…
30 pairs of items from the International Personality Item Pool (Goldberg et al., 2006), judged to be semantically related: either very similar in meaning or apparently opposite in meaning.
The degree of inconsistency in responses indexes RR. Example pairs, rated under the stem "I see myself as someone who…" on a 5-point scale (Strongly Disagree, Disagree, Neither Agree Nor Disagree, Agree, Strongly Agree):
Similar items: "Needs a push to get started" / "Finds it difficult to get down to work" (illustrated response difference of 1)
Opposite items (reverse coded): "Spends time thinking about past mistakes" / "Doesn't worry about things that have already happened" (illustrated response difference of 3)
Experimental Manipulation:
Quick condition (Q): participants subtly encouraged to complete the task quickly; in-test messages emphasized the importance of students' time.
Control condition (A, accurate): participants instructed to complete the survey accurately; in-test messages emphasized accuracy.
Phase 1: Selection and validation of final item pairs (correlations maximized), yielding the 22-item (11-pair) SIS.
Phase 2: SIS scale assessed using new responses. SIS score = mean discrepancy across SIS pairs (possible range: 0-4); a scoring sketch follows.
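To make the scoring concrete, here is a minimal sketch assuming 1-5 Likert coding, hypothetical pair indices, and reverse coding of opposite-meaning items; this is not the authors' actual code, and the real 11 pairs are not listed on the poster:

```python
import numpy as np

# Hypothetical pair structure: (index of item A, index of item B, is_opposite).
# Only two illustrative pairs are shown; the real SIS has 11.
SIS_PAIRS = [(0, 1, False), (2, 3, True)]

def sis_score(responses, pairs=SIS_PAIRS, scale_max=5):
    """Mean absolute discrepancy across item pairs (possible range: 0-4)."""
    diffs = []
    for a, b, is_opposite in pairs:
        rb = responses[b]
        if is_opposite:                 # reverse code the opposite-meaning item
            rb = scale_max + 1 - rb
        diffs.append(abs(responses[a] - rb))
    return float(np.mean(diffs))

# A consistent responder: pair 1 differs by 1; pair 2 differs by |2 - (6-4)| = 0.
print(sis_score([4, 5, 2, 4]))  # -> 0.5
```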
Analyses: Q vs. A comparison; survey completion time; attention to survey content; real vs. random responses. All results were calculated on the Phase 2 sample only (unless otherwise specified).
Median SIS scores: Q > A (Wilcoxon test z = 2.179, p < .05).
Figure 1. Trimmed (20%) means for SIS scores in condition A (accurate) versus Q (quick).
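This kind of two-group rank comparison is easy to reproduce; a sketch with made-up scores, reading the poster's "Wilcoxon test" as the Wilcoxon rank-sum test for independent groups:

```python
from scipy.stats import ranksums

# Placeholder SIS scores; the real Phase 2 data are not published on the poster.
sis_quick = [0.9, 1.2, 0.7, 1.5, 1.1, 0.8]      # Q (quick) condition
sis_accurate = [0.5, 0.8, 0.6, 0.9, 0.4, 0.7]   # A (accurate) condition

z, p = ranksums(sis_quick, sis_accurate)  # z statistic and two-sided p-value
print(f"z = {z:.3f}, p = {p:.3f}")
```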
Correlation between SIS scores and time to complete the full survey: Spearman's rho = -.13 (p = .06).
Attention to survey content was measured with multiple-choice questions about the survey items participants had just seen and responded to. The number of questions answered incorrectly was predicted to correlate positively with SIS; no association was found (Spearman's rho = .04, p > .05). A sketch of both correlations follows.
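Both checks are simple rank correlations; a sketch with placeholder arrays (the actual completion times and error counts are not published):

```python
from scipy.stats import spearmanr

# Placeholder data standing in for the Phase 2 survey logs.
sis = [0.5, 1.1, 0.7, 1.6, 0.9]
completion_time = [420, 300, 390, 240, 350]  # seconds to finish the survey
errors = [0, 2, 1, 3, 1]                     # attention-check items missed

print(spearmanr(sis, completion_time))  # poster: rho = -.13, p = .06
print(spearmanr(sis, errors))           # poster: rho = .04, p > .05
```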
SIS discrimination between 100% random responding (computer-generated) and actual participant responses: Phase 2 responses were compared with 100,000 records of randomly generated responses, and the SIS was scored on all records. Real scores were lower than random-response scores (t = 31.56, p < .001; Figure 2). A simulation sketch follows Figure 2.
Figure 2. Distribution of true Phase 2 SIS scores (blue) versus randomly generated profiles (red).
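A minimal sketch of the random-records comparison, assuming uniform random draws over the 1-5 scale and an illustrative pair layout (adjacent item indices paired; the layout and opposite-pair flags are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layout: 11 pairs = 22 items, adjacent indices paired,
# with made-up flags marking which pairs are opposite-meaning.
OPPOSITE = np.array([False, True] * 5 + [False])

def sis_scores(records, scale_max=5):
    a = records[:, 0::2].astype(float)               # first item of each pair
    b = records[:, 1::2].astype(float)               # second item of each pair
    b[:, OPPOSITE] = scale_max + 1 - b[:, OPPOSITE]  # reverse code opposites
    return np.abs(a - b).mean(axis=1)                # mean discrepancy, 0-4

# 100,000 fully random records: uniform draws over the 1-5 Likert scale.
random_scores = sis_scores(rng.integers(1, 6, size=(100_000, 22)))
print(random_scores.mean())  # centers near 1.6; the poster finds real scores lower

# With the (non-public) Phase 2 data in hand, the poster's comparison would be
# something like: scipy.stats.ttest_ind(real_scores, random_scores)
```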
SIS sensitivity was assessed by discriminating between true Phase 2 records and an equal number of randomly generated records, using receiver operating characteristic (ROC) analysis. The area under the curve (AUC) indexes the discrimination ability of the test: AUC = .95 (excellent discrimination). A sketch follows Figure 3.
Figure 3. ROC analysis for Phase 2 responses vs. (100%) randomly generated response records.
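The AUC computation itself is standard; a sketch with placeholder score arrays (higher SIS scores should indicate random responding):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Placeholder SIS scores for real and randomly generated records.
real_scores = np.array([0.4, 0.7, 0.5, 0.9, 0.6])
random_scores = np.array([1.5, 1.8, 1.2, 1.7, 1.6])

# Labels: 0 = real participant record, 1 = randomly generated record.
y_true = np.concatenate([np.zeros_like(real_scores), np.ones_like(random_scores)])
y_score = np.concatenate([real_scores, random_scores])
print(roc_auc_score(y_true, y_score))  # poster reports AUC = .95 on the real data
```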
1. Dataset split in half randomly. Control group: original (real) responses. Random group: a randomly selected X% of responses replaced with random values, with X ranging from 1% to 100% (i.e., the process is performed 100 times).
2. SIS scored and AUC calculated: SIS discrimination between the Control and Random groups.
3. Result: SIS discrimination between real and partially (1% to 100%) random responding.
4. The entire process was repeated 100 times to even out random selection.
Each run yielded AUCs comparing real responses with partially random responses at each level from 0% to 100% random; the mean of the 100 AUCs was taken at each point. A simulation sketch follows Figure 4.
Figure 4. AUCs for 100 runs of SIS discrimination between original profiles and partially (1% through 100%) random profiles. Light blue lines are AUCs for 100 individual runs; dark blue line indicates mean AUC at each point.
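A compact sketch of the whole procedure, assuming uniform 1-5 replacement values and reusing a sis_scores helper like the one sketched after Figure 2 (all names hypothetical, not the authors' code):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def partial_random_aucs(data, sis_scores, n_runs=100):
    """AUCs discriminating real records from partially randomized copies.

    data: (n_records, n_items) array of Likert responses coded 1-5.
    sis_scores: function mapping a records array to per-record SIS scores.
    """
    n = len(data)
    aucs = np.zeros((n_runs, 100))
    for run in range(n_runs):
        idx = rng.permutation(n)                       # split in half randomly
        control, target = data[idx[: n // 2]], data[idx[n // 2 :]]
        for i, pct in enumerate(range(1, 101)):
            noisy = target.copy()
            # Replace a randomly selected pct% of all responses with random values.
            mask = rng.random(noisy.shape) < pct / 100
            noisy[mask] = rng.integers(1, 6, size=int(mask.sum()))
            scores = np.concatenate([sis_scores(control), sis_scores(noisy)])
            labels = np.r_[np.zeros(len(control)), np.ones(len(noisy))]
            aucs[run, i] = roc_auc_score(labels, scores)
    return aucs  # plot per-run curves and the column-wise mean (cf. Figure 4)
```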
Semantic Inconsistency Scale (SIS): Phase 1 developed the scale (22 items / 11 pairs); Phase 2 validated it. Identification of random responding was excellent with 100% random responses, with fair performance even for protocols containing less than 20% random responding. The SIS discriminated between Quick participants (primed and instructed to answer hastily) and Accurate participants (given regular instructions).
The SIS should perform as well as (if not better than) comparable scales and can be easily inserted into a variety of psychological and personality tests; modification of item stems or formats may allow use with an even wider range of instruments.
Limitations and Future Directions: The SIS is not appropriate for all test varieties (e.g., very short research or clinical protocols). Random responding is not always a problem; whether it is depends on the clinical or research situation, and the SIS can help determine whether it is.
SIS = robust and valid measure of random responding. FREE: Creative Commons licensed.
Archer, R. P., & Smith, S. R. (2008). Personality assessment. CRC Press.
Berry, D. R., Wetter, M. W., Baer, R. A., Widiger, T. A., Sumpter, J. C., Reynolds, S. K., & Hallam, R. A. (1991). Detection of random responding on the MMPI-2: Utility of F, back F, and VRIN scales. Psychological Assessment: A Journal of Consulting and Clinical Psychology, 3(3).
Butcher, J. N., Graham, J. R., Ben-Porath, Y. S., Tellegen, A., Dahlstrom, W. G., & Kaemmer, B. (2001). MMPI-2 (Minnesota Multiphasic Personality Inventory-2): Manual for administration and scoring (2nd ed.). Minneapolis, MN: University of Minnesota Press.
Charter, R. A., & Lopez, M. N. (2003). MMPI-2: Confidence intervals for random responding to the F, F Back, and VRIN scales. Journal of Clinical Psychology, 59(9).
Costa, P. T., Jr., & McCrae, R. R. (1997). Stability and change in personality assessment: The Revised NEO Personality Inventory in the year 2000. Journal of Personality Assessment, 68(1), 86.
Cramer, K. M. (1995). Comparing three new MMPI-2 randomness indices in a novel procedure for random profile derivation. Journal of Personality Assessment, 65(3).
Gallen, R. T., & Berry, D. R. (1996). Detection of random responding in MMPI-2 protocols. Assessment, 3(2).
Goldberg, L. R., Johnson, J. A., Eber, H. W., Hogan, R., Ashton, M. C., Cloninger, C. R., & Gough, H. G. (2006). The international personality item pool and the future of public-domain personality measures. Journal of Research in Personality, 40(1).
Huang, J. L., Curran, P. G., Keeney, J., Poposki, E. M., & DeShon, R. P. (2012). Detecting and deterring insufficient effort responding to surveys. Journal of Business and Psychology, 27(1).
John, O. P., & Srivastava, S. (1999). The Big Five trait taxonomy: History, measurement, and theoretical perspectives. Handbook of Personality: Theory and Research, 2.
Morey, L. C. (2007). Personality Assessment Inventory (PAI).