The Delivery Matters: Examining Reactivity in Question Answering
Riley E. Foreman, Marshall T. Beauchamp, and Erin M. Buchanan
Missouri State University


Abstract

In recent history, there has been a greater propensity to administer surveys electronically rather than with pencil and paper (Buchanan & Smith, 1999). For this shift to be justified, internet surveys must be shown to be at least as reliable as their physical counterparts. Equivalence studies have generally found computer-based surveys comparable to their paper versions (Weigold, 2013).

Discussion

Overall, paper and computer administrations showed differences in average scores, item correlations, and percent change on both versions of the questionnaire.

Continuous measures: Paper and computer administrations showed similar average scores, but randomizing the questions lowered item averages. Paper administration showed the lowest item correlations, indicating potential item reactivity. However, non-randomized computer administrations had higher item correlations than randomized computer administrations, indicating that randomization does not affect reactivity in the way hypothesized.

Categorical measures: Paper and computer administrations showed different average question scores, as did randomized and non-randomized administrations. Paper questionnaires had lower percent change than computer versions, and randomized computer versions were lower than non-randomized administrations; this surprising finding might be explained by the scales' different answering options.

Limitations: This analysis cannot distinguish between positive and negative reactivity, which may explain differences across scales. There was no randomized paper administration.

Contact

Riley E. Foreman:
Erin M. Buchanan:
The Purpose in Life Test

The Purpose in Life Test (PIL; Crumbaugh & Maholick, 1964):

1. I am usually: (1 - completely bored; 7 - exuberant, enthusiastic)
2. Life to me seems: (1 - completely routine; 7 - always exciting)*
3. In life I have: (1 - no goals or aims at all; 7 - very clear goals and aims)
4. My personal existence is: (1 - utterly meaningless, without purpose; 7 - very purposeful and meaningful)
5. Every day is: (1 - exactly the same; 7 - constantly new)*
6. If I could choose, I would: (1 - prefer never to have been born; 7 - like nine more lives just like this one)
7. After retiring, I would: (1 - loaf completely the rest of my life; 7 - do some of the exciting things I have always wanted to)*
8. In achieving life goals I have: (1 - made no progress whatever; 7 - progressed to complete fulfillment)
9. My life is: (1 - empty, filled only with despair; 7 - running over with exciting good things)
10. If I should die today, I would feel that my life has been: (1 - completely worthless; 7 - very worthwhile)*
11. In thinking of my life, I: (1 - often wonder why I exist; 7 - always see a reason for my being here)
12. As I view the world in relation to my life, the world: (1 - completely confuses me; 7 - fits meaningfully with my life)
13. I am a: (1 - very irresponsible person; 7 - very responsible person)
14. Concerning man's freedom to make his own choices, I believe man is: (1 - completely bound by limitations of heredity and environment; 7 - absolutely free to make all life choices)*
15. With regard to death, I am: (1 - unprepared and frightened; 7 - prepared and unafraid)*
16. With regard to suicide, I have: (1 - thought of it seriously as a way out; 7 - never given it a second thought)
17. I regard my ability to find a meaning, purpose or mission in life as: (1 - practically none; 7 - very great)*
18. My life is: (1 - out of my hands and controlled by external factors; 7 - in my hands and I am in control of it)*
19. Facing my daily tasks is: (1 - a painful and boring experience; 7 - a source of pleasure and satisfaction)
20. I have discovered: (1 - no mission or purpose in life; 7 - clear-cut goals and a satisfying life purpose)

Note. The Life Purpose Questionnaire (LPQ; Hutzell, 1986) is a modified version of the PIL with a dichotomous Agree - Disagree format. All questions are scored such that high scores indicate higher meaning/purpose in life.

Data Analyzed

Dataset          N     Mode of Administration   Randomized
Dinkel           633   Computer                 No
MSU                    Computer                 No
MSU                    Computer                 Yes
MSU                    Paper/Pencil             No
Logotherapy      341   Paper/Pencil             No
Meaning in Life  298   Paper/Pencil             No
SES              265   Paper/Pencil             No
UM                     Paper/Pencil             No

The Importance of Scale Psychometrics

Scales that are psychometrically sound - meaning those that meet established standards for reliability and validity while measuring one or more constructs of interest - are customarily evaluated under a set modality of delivery (i.e., via the Internet, handwritten, etc.) and administration (fixed item order). Deviating from an established administration profile could produce non-equivalent response patterns, indicating the possible evaluation of a dissimilar construct. Furthermore, item grouping may influence response patterns, and randomizing item order may alter or eliminate these effects. Therefore, we examined differences in scale relationships between computer and handwritten delivery for two scales measuring meaning/purpose in life. These scales contain questions about suicidality, depression, and life goals that may cause reactivity (i.e., a changed response to a second item based on the answer to the first item).
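The poster reports item correlations for the continuous (PIL) scale and percent change for the categorical (LPQ) scale but does not give formulas. As a minimal sketch, one plausible operationalization computes both metrics over adjacent item pairs, since reactivity is defined here as a changed response to a second item based on the first; the function names and the adjacent-pair choice are illustrative assumptions, not the authors' stated method.

```python
import numpy as np

def adjacent_item_correlations(responses):
    """Pearson r for each adjacent item pair (continuous scales).

    responses: 2-D array, rows = respondents, columns = items in
    administered order (e.g., 1-7 PIL ratings). Under the poster's
    hypothesis, lower adjacent-item correlations signal reactivity.
    """
    x = np.asarray(responses, dtype=float)
    return np.array([np.corrcoef(x[:, i], x[:, i + 1])[0, 1]
                     for i in range(x.shape[1] - 1)])

def percent_change(responses):
    """Percent of respondents whose answer differs between each
    adjacent item pair (categorical scales, e.g., LPQ Agree/Disagree).
    Under the poster's hypothesis, higher percent change signals
    reactivity.
    """
    x = np.asarray(responses)
    return 100.0 * (x[:, :-1] != x[:, 1:]).mean(axis=0)
```

Each administration condition (paper, computer fixed-order, computer randomized) would yield one vector of correlations and one of percent-change values, which can then be compared across conditions.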
Hypotheses

Paper and computer measures should have similar item correlations, average scores, and percent change values.
Continuous measures: reactivity is indicated by lower item correlations.
Categorical measures: reactivity is indicated by higher percent change.

Analysis

Paper v Computer (All): t(19) = 1.33, p = .20, d = 0.30
Paper v Computer (Not Random): t(19) = 0.15, p = .88, d = 0.03
Random v Not (Computer): t(19) = 3.57, p < .01, d = 0.80
Random v Not (All): t(19) = 5.41, p < .001, d = 0.83

Paper v Computer (All): t(18) = 3.19, p < .01, d = 0.73
Paper v Computer (Not Random): t(18) = 3.97, p < .01, d = 0.91
Random v Not (Computer): t(18) = 3.88, p < .01, d = 0.89
Random v Not (All): t(18) = 0.02, p = .98, d = 0.01

Paper v Computer (All): t(19) = 7.19, p < .001, d = 1.61
Random v Not (Computer): t(19) = 4.83, p < .001, d = 1.08

Paper v Computer (All): t(18) = 4.73, p < .001, d = 1.09
Random v Not (Computer): t(19) = 5.42, p < .001, d = 1.24
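The reported degrees of freedom (t(19) for the 20-item PIL) are consistent with paired comparisons of per-item statistics across items. A minimal sketch of such a comparison follows; the function name is hypothetical, and the Cohen's d convention (mean difference over the standard deviation of the differences) is an assumption, since the poster does not state its formula.

```python
import numpy as np

def paired_t_and_d(stat_a, stat_b):
    """Paired t statistic and Cohen's d across matched items.

    stat_a, stat_b: one per-item statistic per condition, e.g. the
    20 PIL item means under paper vs. computer administration
    (df = n - 1 = 19, matching the reported t(19) values).
    """
    diff = np.asarray(stat_a, dtype=float) - np.asarray(stat_b, dtype=float)
    n = diff.size
    sd = diff.std(ddof=1)                # sample SD of item-wise differences
    t = diff.mean() / (sd / np.sqrt(n))  # paired t; p comes from a t(n-1) distribution
    d = diff.mean() / sd                 # one common d convention for paired designs
    return t, d
```

In practice the p-value would be obtained from the t distribution with n - 1 degrees of freedom (e.g., `scipy.stats.ttest_rel`).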