Measurement, Control, and Stability of Multiple Response Styles Using Reverse Coded Items
Eric Tomlinson and Daniel Bolt
University of Wisconsin-Madison
Ideas in Testing Research Seminar | October 11, 2013

Introduction and Definitions
Response styles: systematic tendencies in how respondents use the score categories of a rating scale, regardless of item content.
Extreme Response Style (ERS): the tendency to use the extreme endpoints of the rating scale.
Disacquiescence Response Style (DRS): the tendency to select disagreement categories across items.

Introduction and Definitions
Why is it important to consider response styles?
- Threat to the validity of individual scale scores
- Bias introduced by systematic response style differences across groups of respondents (see Van Vaerenbergh and Thomas, 2012)
- Bias in the estimation of trait correlations

IRT Model for Detection and Control of Response Styles
Response styles are accounted for through continuous latent trait variables, modeling how the substantive and response style traits simultaneously influence response category selection (Bolt and Johnson, 2009; Bolt and Newton, 2011).
Notation: j = item; k = response category; K = total number of categories.
The model is estimated in LatentGold, with the slope (a) parameters for the substantive trait (θ1) fixed at equally spaced values (-3, -1, 1, 3) and those for the ERS trait (θERS) fixed at (1, -1, -1, 1).
Information criteria (AIC, BIC, AIC3, CAIC) are used to compare models and constraints.
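A sketch of the item response function being described, assuming the multidimensional nominal response model parameterization used by Bolt and Johnson (2009), where c_{jk} denotes the category intercept for category k of item j:

P(X_j = k \mid \theta_1, \theta_{ERS}) =
  \frac{\exp(a_{jk1}\theta_1 + a_{jk,ERS}\theta_{ERS} + c_{jk})}
       {\sum_{h=1}^{K} \exp(a_{jh1}\theta_1 + a_{jh,ERS}\theta_{ERS} + c_{jh})}

With the a parameters fixed to the constrained values above, only the category intercepts c_{jk} are estimated for each item.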

IRT Models for Detection and Control
Appealing features of IRT models:
- Conceptually, the method treats a response selection as a function of both the substantive trait level and a response style tendency.
- Functionally, the model can estimate both the substantive trait level(s) and the response style, accounting for their simultaneous influence on response selection.
Previously noted limitation: the Bolt & Johnson model has detected and controlled for only one response style (ERS). This limitation is largely due to the data rather than the model, so we attempt to fit the model to data containing reverse coded items in order to statistically detect and control for multiple response styles.

TIMSS Dataset
2007 Trends in International Mathematics and Science Study
Responses from 7,377 United States eighth graders
8-item "Enjoyment of Science" scale and 8-item "Enjoyment of Math" scale
Response categories: 1 = strongly agree, 2 = agree, 3 = disagree, 4 = strongly disagree
The attitudinal scales are similar in structure to those in PISA, but with 3 of the 8 items in each scale reverse worded.

TIMSS Enjoyment Scale Items
Enjoyment of Science
1) I usually do well in science.
2) I would like to take more science.
3) Science is more difficult for me than for many of my classmates.
4) I enjoy learning science.
5) Science is not one of my strengths.
6) I learn things quickly in science.
7) Science is boring.
8) I like science.
Enjoyment of Math
1) I usually do well in math.
2) I would like to take more math.
3) Math is more difficult for me than for many of my classmates.
4) I enjoy learning math.
5) Math is not one of my strengths.
6) I learn things quickly in math.
7) Math is boring.
8) I like math.
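Items 3, 5, and 7 of each scale are the reverse-worded items. As a minimal illustration (not part of the original analysis), the Python sketch below reverse codes those items on the 1-4 scale and forms a classical sum score of the kind referenced later; the response matrix is hypothetical.

import numpy as np

# Hypothetical responses to the 8 "Enjoyment of Science" items
# (rows = respondents, columns = items), coded 1 = strongly agree ... 4 = strongly disagree.
responses = np.array([
    [1, 1, 4, 1, 4, 2, 4, 1],   # a student who enjoys science
    [3, 3, 2, 3, 2, 3, 1, 3],   # a student who mostly does not
])

reverse_items = [2, 4, 6]       # zero-based indices of items 3, 5, and 7

recoded = responses.copy()
recoded[:, reverse_items] = 5 - recoded[:, reverse_items]   # flip the 1-4 scale

# Sum score after reverse coding: lower totals indicate more enjoyment,
# consistent with 1 = strongly agree on the positively worded items.
print(recoded.sum(axis=1))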

IRT Model with Multiple Response Style Traits
We extend the model to include three distinct latent traits:
- θ1 is the substantive "enjoyment" trait, with interval-spaced a_jk1 parameters
- θERS is an ERS trait, with a_jkERS parameters taking equal positive values at the endpoint categories and equal negative values at the intermediate categories
- θDRS is a DRS trait, with a_jkDRS parameters taking equal negative values for the agreement categories and equal positive values for the disagreement categories
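In the same notation as before, a sketch of the three-trait extension:

P(X_j = k \mid \theta_1, \theta_{ERS}, \theta_{DRS}) =
  \frac{\exp(a_{jk1}\theta_1 + a_{jk,ERS}\theta_{ERS} + a_{jk,DRS}\theta_{DRS} + c_{jk})}
       {\sum_{h=1}^{K} \exp(a_{jh1}\theta_1 + a_{jh,ERS}\theta_{ERS} + a_{jh,DRS}\theta_{DRS} + c_{jh})}

All three sets of a parameters are fixed to the category patterns just described (the specific values appear on the model comparison slide), so only the category intercepts c_{jk} are estimated.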

Methods
1. Confirm the presence of ERS and DRS traits using model comparison statistics.
2. Estimate category intercepts and student substantive traits for the "Enjoyment of Science" and "Enjoyment of Math" scales.
3. Examine the relative bias introduced by ERS and DRS.
4. Evaluate the stability of response styles across the two scales.
5. Compute correlations between student trait estimates and other variables (e.g., achievement).

Model Comparison Statistics
One-dimensional: a θ1 trait with category slopes constrained to fixed, equally spaced values (-3, -1, 1, 3), and (3, 1, -1, -3) for reverse-worded items.
Two-dimensional: the first model extended with an ERS factor with fixed category slopes (1, -1, -1, 1).
Three-dimensional: the second model extended with a DRS factor with fixed category slopes (-1, -1, 1, 1).
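A minimal sketch of how the four criteria named earlier can be computed from each fitted model's log-likelihood; the log-likelihoods and parameter counts below are hypothetical placeholders, with the sample size taken from the TIMSS data.

import math

def information_criteria(log_lik, n_params, n_obs):
    """Standard penalized-fit indices; smaller values indicate a preferred model."""
    dev = -2.0 * log_lik
    return {
        "AIC":  dev + 2 * n_params,
        "AIC3": dev + 3 * n_params,
        "BIC":  dev + n_params * math.log(n_obs),
        "CAIC": dev + n_params * (math.log(n_obs) + 1),
    }

# Hypothetical values for the one-, two-, and three-dimensional models.
for label, ll, p in [("1D", -60500.0, 48), ("2D", -60120.0, 50), ("3D", -59980.0, 52)]:
    print(label, information_criteria(ll, p, n_obs=7377))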

Example Response Patterns
Using the three-dimensional model, we estimated category intercepts for each item as well as student substantive trait estimates for both scales.
Example response patterns for the 8-item "Enjoyment of Science" scale were presented to illustrate the estimates.
The mean of each trait (θ) is fixed at 0, and higher values of θ1 correspond to less enjoyment.

Examination of Relative Bias

Stability of Response Styles
We calculated the correlations between the ERS and DRS estimates for each respondent across the two subscales as a measure of the stability of the response style tendencies.
Correlations between corresponding trait estimates across scales:
  Math θ1 with Science θ1: .071**
  Math θERS with Science θERS: .286**
  Math θDRS with Science θDRS: .188**
** significant at the .01 level (2-tailed)
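As a sketch of how such cross-scale correlations could be computed from per-respondent trait estimates; the arrays here are simulated stand-ins, not the actual TIMSS estimates.

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Simulated stand-ins for the estimated ERS trait scores on each subscale.
math_ers = rng.normal(size=7377)
science_ers = 0.3 * math_ers + rng.normal(size=7377)

r, p_value = pearsonr(math_ers, science_ers)
print(f"Math ERS vs. Science ERS: r = {r:.3f}, p = {p_value:.3g}")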

Correlations of θ1 with Subject Achievement
We calculated the correlations between each respondent's substantive trait estimates and achievement scores in each discipline.
  Math Sum Score with Math Achievement: .346**
  Math θ1 with Math Achievement: .335**
  Science Sum Score with Science Achievement: .300**
  Science θ1 with Science Achievement: .300**
** significant at the .01 level (2-tailed)

Conclusions
Using TIMSS data, the model statistically detected the presence of two response styles and controlled for them in estimating student substantive traits.
The presence of reverse coded items on TIMSS is clearly helpful in identifying additional response styles.
Through our model, we are able to quantify the relative bias introduced by ERS and DRS, as well as the stability of the response style traits across scales.
Future work is needed to evaluate the optimal design of survey instruments so as to maximize the potential for detecting response styles.