Psychometrics Timothy A. Steenbergh and Christopher J. Devers Indiana Wesleyan University.



Overview
A. Psychometrics
B. Classical Test Theory
C. Reliability
D. Validity

A. Psychometrics
- Psychological measurement
- Reliability
- Validity
- Tests
- Items
(Jones & Thissen, 2007; Kaplan & Saccuzzo, 2012)

B. Classical Test Theory
- Foundation for reliability (Kline, 2005)

For those who like pictures…

Proportion of True to Observed Score
[Diagram: BDI Score (X) as the observed score; Depression Level as the true score]

[Diagram: BDI Score (X) made up of Depression Level (True Score) and Measurement Error (E)]

Adding it up…
Depression Level (True Score) + Measurement Error = Observed Score
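The decomposition on these slides (observed score = true score + measurement error) can be illustrated with a small simulation. This is a minimal sketch, not part of the original deck: the function name, the score distribution, and the standard deviations are all hypothetical, and it assumes errors are random and independent of true scores. Under those assumptions, reliability is the proportion of observed-score variance that comes from true-score variance.

```python
import random
from statistics import pvariance


def simulate_reliability(n=10_000, true_sd=8.0, error_sd=4.0, seed=0):
    """Simulate X = T + E and return var(T) / var(X)."""
    rng = random.Random(seed)
    # Hypothetical true depression levels (T), centered on 15.
    true_scores = [rng.gauss(15.0, true_sd) for _ in range(n)]
    # Observed scores (X): true score plus independent random error (E).
    observed = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    # Reliability as the proportion of true to observed score variance.
    return pvariance(true_scores) / pvariance(observed)


reliability = simulate_reliability()
# Expected to land near 8**2 / (8**2 + 4**2) = 0.80, up to sampling noise.
print(round(reliability, 2))
```

Raising error_sd while holding true_sd fixed pushes the ratio toward 0; shrinking it pushes the ratio toward 1.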

C. Reliability
What does it mean to be reliable?
- Consistency of scores over time, across test forms, or across variable testing conditions
Types of reliability:
- Test-retest
- Inter-item (internal)
- Inter-rater
(Anastasi, 1988)

C.1. Test-Retest Reliability
Are test scores stable over time?
- Give the test to the same group at two points in time and correlate the two sets of scores
- Consider the stability of the construct when establishing the test-retest interval and when interpreting the test-retest correlation
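As a rough illustration of "correlate test scores," the sketch below computes a Pearson correlation between two administrations of the same test. The scores and the helper name pearson_r are made up for the example; in practice a statistics package such as R, SPSS, or PSPP would normally do this.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    # Sum of cross-products divided by the root of the product of sums of squares.
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Hypothetical scores for the same five people at time 1 and time 2.
time1 = [10, 14, 18, 22, 30]
time2 = [12, 13, 20, 21, 29]

test_retest_r = pearson_r(time1, time2)
print(round(test_retest_r, 2))
```

A high coefficient here only supports stability if the construct itself is expected to be stable over the chosen interval.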

C.2. Internal (Inter-Item) Consistency
Assumption: a composite score has to be made up of items that measure the same phenomenon
- Heterogeneous items will produce a lower internal consistency reliability coefficient
Measures of internal consistency:
- Split-half
- Cronbach’s alpha (coefficient α)
- Kuder-Richardson 20 (KR-20; for dichotomous items)
(Pedhazur & Schmelkin, 1991)
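The two coefficients named on this slide can be sketched from their standard textbook formulas. The function names and the response data below are hypothetical. Population variances are used throughout; alpha is unchanged if sample variances are used instead, as long as the same kind is used for the items and for the total score.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Coefficient alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [pvariance([row[i] for row in rows]) for i in range(k)]
    total_var = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def kr20(rows):
    """Kuder-Richardson 20 for items scored 0 or 1."""
    n = len(rows)
    k = len(rows[0])
    # p is the proportion answering each item correctly; q = 1 - p.
    p = [sum(row[i] for row in rows) / n for i in range(k)]
    pq_sum = sum(pi * (1 - pi) for pi in p)
    total_var = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Hypothetical responses: 5 people x 4 dichotomous items.
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print(round(kr20(answers), 2), round(cronbach_alpha(answers), 2))
```

For 0/1 items the variance of an item equals p(1 - p), so the two functions agree on dichotomous data; alpha is the more general form for items with more than two score values.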

Interpreting Reliability Coefficients
What is a reasonable level of reliability?
- Research: ≥ .80
- Clinical: ≥ .90
Factors to consider when evaluating a reliability coefficient:
- Stability of construct
- Dimensional nature of construct (uni- vs. multi-)
- Number of items (short tests are less reliable)
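One common way to quantify the last point (short tests are less reliable) is the Spearman-Brown prophecy formula. The slides do not name it, so treat this as a supplementary sketch: the function name and the example values are hypothetical, and the formula assumes any added items are parallel to the existing ones.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical 10-item test with reliability .60.
doubled = spearman_brown(0.60, 2)    # lengthened to 20 items
halved = spearman_brown(0.60, 0.5)   # shortened to 5 items

print(round(doubled, 2), round(halved, 2))
```

The prediction rises when the test is lengthened and falls when it is shortened, which is the pattern the slide describes.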

C.3. Inter-Rater Reliability
Accuracy (consistency) with which different raters arrive at the same scores
- Extremely important for tests that require any rater judgment (e.g., WAIS vocabulary)
- Agreement is computed with the Kappa statistic
- Ranges from K = 1.0 (perfect agreement), through 0 (chance agreement), to -1.0 (less than chance agreement)
- “fair” >.75 “excellent” (Fleiss, 1981)
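The slide does not say which kappa variant is meant, so the sketch below assumes the simplest case: Cohen's kappa for two raters assigning categorical scores. The function name and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same responses."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of responses")
    n = len(rater_a)
    # Observed agreement: proportion of responses given identical scores.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: expected overlap given each rater's score frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    # Undefined (division by zero) if both raters use one and the same category throughout.
    return (observed - expected) / (1 - expected)


# Hypothetical 0/1/2 scores given by two raters to the same ten responses.
scores_a = [2, 2, 1, 0, 2, 1, 0, 2, 1, 1]
scores_b = [2, 2, 1, 0, 1, 1, 0, 2, 2, 1]

kappa = cohens_kappa(scores_a, scores_b)
print(round(kappa, 4))
```

Here the raters agree on 8 of 10 responses, but kappa is lower than .80 because some of that agreement would be expected by chance.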

D. Validity
If something is valid, what does that mean?
- Validity: the degree to which a test measures that which it purports to measure
Types:
- Content
- Criterion-related
- Construct

D.1. Content Validity
How well does the instrument sample from the domain of interest?
- Lack of adequate item sampling can lead to invalid findings
- Examples: GBQ (see p. 144 of article), WAIS
- Assess with expert raters

D.2. Criterion-Related Validity
Does the test score correlate with other measures as we would expect?
- Concurrent validity: test score relates to a criterion measured at the same time
- Predictive validity: test score predicts a future criterion
- Validity coefficient: correlation coefficient between test score and criterion measure

D.3. Construct Validity
Is there evidence that the measure adequately assesses the construct of interest?
- Do test scores change over time or as a result of certain events, as theorized?
- Are items homogeneous, or do certain items “hang together”? (Factor Analysis)

Factor Analysis
- Statistical method for examining underlying constructs (latent traits) within a test
- Uses correlation matrices to identify underlying relationships among test items
- Example: GBQ
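The sketch below shows only the starting point mentioned on this slide, an inter-item correlation matrix, not factor analysis itself, which would normally be run in a statistics package such as R, SPSS, or PSPP. The item names and responses are made up so that two pairs of items "hang together."

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cross = sum(a * b for a, b in zip(dx, dy))
    return cross / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def correlation_matrix(items):
    """items maps an item name to its list of scores, one score per respondent."""
    names = list(items)
    return {a: {b: pearson_r(items[a], items[b]) for b in names} for a in names}


# Hypothetical responses from six people to four items.
items = {
    "q1": [1, 2, 3, 4, 5, 5],
    "q2": [1, 2, 3, 5, 4, 5],
    "q3": [3, 1, 4, 2, 5, 2],
    "q4": [3, 1, 5, 2, 4, 2],
}

matrix = correlation_matrix(items)
for name, row in matrix.items():
    print(name, [round(r, 2) for r in row.values()])
```

In this made-up data, q1 and q2 correlate strongly with each other, as do q3 and q4, while correlations across the two pairs are weak. A factor analysis would typically summarize that pattern as two underlying factors.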

Overview
Psychometrics
- Psychological measurement
Classical Test Theory
Reliability
- Test-retest
- Inter-item (internal)
- Inter-rater
Validity
- Content
- Criterion-related
- Construct
(Trochim, 2006)

Resources
Software:
- SPSS
- PSPP
- R
Videos:
- Educator.com
- CLI: Research Seminars
- Andy Field
Websites:
- Social Research Methods
- Institute for Digital Research and Education
- Statistics Help for Students
- Stat Pages

References
Anastasi, A. (1988). Psychological testing (6th ed.). New York, NY: Macmillan.
Fleiss, J. L. (1981). Statistical methods for rates and proportions (2nd ed.). New York, NY: John Wiley & Sons.
Jones, L. V., & Thissen, D. (2007). A history and overview of psychometrics. Handbook of statistics, 26,
Kaplan, R., & Saccuzzo, D. (2012). Psychological testing: Principles, applications, and issues. Belmont, CA: Cengage Learning.
Kline, T. J. B. (2005). Classical test theory: Assumptions, equations, limitations, and item analyses. In T. J. B. Kline, Psychological testing: A practical approach to design and evaluation (pp ). Thousand Oaks, CA: Sage.
Pedhazur, E. J., & Schmelkin, L. P. (1991). Measurement, design and analysis: An integrated approach. Hillsdale, NJ: Lawrence Erlbaum.
Trochim, W. M. K. (2006). Reliability and validity. Retrieved from

Questions?
EdProfessor.com