Unidimensionality (U): What’s it good for in applied research?

Presentation transcript:

Unidimensionality (U): What’s it good for in applied research? Richard N. Jones, Sc.D., Hebrew Rehabilitation Center for Aged, Research and Training Institute, Boston, MA (jones@mail.hrca.harvard.edu). Paul K. Crane, M.D., M.P.H., University of Washington, Seattle, WA (pcrane@u.washington.edu). Psychometrics Workshop, September 20-25, 2004.

Overview of this talk: Introduction to the concept of Unidimensionality (U). Appeal to authority: what pioneers, prominent experts, authors, and others have to say about U. Addressing the problem of U empirically. Discussion.

Issues Around Unidimensionality: What does U mean? Why is U important? What is the test for? What is the analysis for (DIF studies, item banking)? Can U be determined empirically? What to do if you don’t think you have a U test? How to respond to a reviewer with concerns about U?

Other Sources
Gould, S. (1981). The Mismeasure of Man. New York: W. W. Norton & Company.
Lord, F. (1980). Applications of Item Response Theory to Practical Testing Problems. Hillsdale, NJ: Lawrence Erlbaum Associates.
Lord, F., & Novick, M. (1968). Latent traits and item characteristic functions (Chapter 16). In Statistical Theories of Mental Test Scores (pp. 358-393). Reading, MA: Addison-Wesley.
Stout, W. (2002). Psychometrics: From practice to theory and back: 15 years of nonparametric multidimensional IRT, DIF/test equity, and skills diagnostic assessment. Psychometrika, 67(4), 485-518.
McDonald, R. (1999). Test Theory: A Unified Treatment. Mahwah, NJ: Erlbaum.
Thissen, D. (2004). Comments on Developing Tailored Instruments: Item Banking and Computer Adaptive Assessment. Conference Proceedings (Advances in Health Outcomes Measurement). National Institutes of Health, Bethesda, Maryland. http://outcomes.cancer.gov/conference/irt/thissen.pdf

Lord and Novick (’68) on U: Actual existence of traits is not necessary for IRT. "...nowhere is there any implication that traits exist in any physical or physiological sense. It is sufficient that a person behave as if he [or she] were in possession of a certain amount of each of a number of relevant traits and that he [or she] behave as if these amounts determined his [or her] behavior." (p. 358)

Embretson and Reise (’00) on U: U IRT models assume that a single latent trait is sufficient to characterize individual differences, for example: a single common factor; multiple factors loading proportionally on items. Not appropriate when: multiple factors load differentially on different items; persons (groups) differ in the factors that load on items (e.g., test-taking strategies, interpretations of items, substantive factors that load on items). Examples: mathematics ability and reading ability in written math items; working memory and concentration in certain cognitive tests.

Stout’s Essential U: U is, essentially, local independence. Strong local independence: the probability of responding u to an item is independent of the other test item responses, conditional on θ. Weak local independence: the conditional covariance (controlling for some trait θ) for all pairs of test items is 0. Essential unidimensionality: the sum of the conditional covariances (controlling for some latent trait θ) of all pairs of test items from a test approaches 0 as the test length approaches infinity. Stout, W. (2002). Psychometrics: From practice to theory and back: 15 years of nonparametric multidimensional IRT, DIF/test equity, and skills diagnostic assessment. Psychometrika, 67(4), 485-518.
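
The three conditions on this slide can be written compactly as follows. This is a sketch in notation that is not on the slide: U_i is the response to item i, n is the test length, and θ is the latent trait. One caveat: the slide speaks of the "sum" of conditional covariances, whereas Stout's definition is usually stated with the average absolute conditional covariance over item pairs. The third line uses that averaged form as an assumption to be checked against Stout (2002).

```latex
% Strong local independence: the joint response probability factors given theta
P(U_1 = u_1, \dots, U_n = u_n \mid \theta) = \prod_{i=1}^{n} P(U_i = u_i \mid \theta)

% Weak local independence: every pairwise conditional covariance vanishes
\operatorname{Cov}(U_i, U_j \mid \theta) = 0 \quad \text{for all } i \neq j

% Essential unidimensionality (averaged form, assumed here):
\frac{1}{\binom{n}{2}} \sum_{1 \le i < j \le n}
  \bigl| \operatorname{Cov}(U_i, U_j \mid \theta) \bigr| \;\longrightarrow\; 0
  \quad \text{as } n \to \infty
```

Strong local independence implies weak local independence, and weak local independence implies the essential form, so the three conditions run from most to least restrictive.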

Jones on U: Factor analysis solutions are sample dependent. Think about your data. Use parcels/testlets/composites where appropriate and possible. Understand logical dependencies among test items. Look at model residuals. Use an empirical device to support an argument for U; no single device is universally accepted. The argument for U is sample dependent (sample of persons, sample of behaviors). Model misspecification: data-driven versus a priori latent trait models can lead to different conclusions.
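
One way to act on "look at model residuals" is to fit a one-factor model to the item correlation matrix and see which item pairs it reproduces poorly. The sketch below is illustrative only and is not the presenters' procedure. It uses iterated principal-axis factoring as a simple stand-in estimator, and both correlation matrices are hypothetical: one generated from a single factor with loadings of 0.7, and one in which items 0 and 1 share extra covariance, as two logically dependent items might.

```python
import numpy as np

def one_factor_residuals(R, n_iter=200):
    """Fit one factor by iterated principal-axis factoring.

    Returns the loadings and the off-diagonal residual correlations
    (observed minus model-implied). A stand-in estimator, assumed
    here for illustration only.
    """
    R = np.asarray(R, dtype=float)
    # Initial communalities: squared multiple correlations.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)          # communalities on the diagonal
        vals, vecs = np.linalg.eigh(reduced)   # eigenvalues in ascending order
        lam = vecs[:, -1] * np.sqrt(max(vals[-1], 0.0))
        h2 = lam ** 2                          # updated communalities
    if lam.sum() < 0:                          # resolve the arbitrary sign
        lam = -lam
    resid = R - np.outer(lam, lam)
    np.fill_diagonal(resid, 0.0)
    return lam, resid

# Hypothetical data: six items, all loading 0.7 on a single factor.
lam_true = np.full(6, 0.7)
R_clean = np.outer(lam_true, lam_true)
np.fill_diagonal(R_clean, 1.0)

# Same matrix, but items 0 and 1 share extra covariance (local dependence).
R_ld = R_clean.copy()
R_ld[0, 1] = R_ld[1, 0] = 0.69

_, resid_clean = one_factor_residuals(R_clean)
_, resid_ld = one_factor_residuals(R_ld)

i, j = np.unravel_index(np.argmax(np.abs(resid_ld)), resid_ld.shape)
print("largest |residual|, one-factor data:", np.abs(resid_clean).max())
print("largest |residual|, locally dependent data:", np.abs(resid_ld).max())
print("worst-fitting item pair:", sorted((int(i), int(j))))
```

A large residual concentrated in one pair points to a local dependency that a parcel or testlet could absorb, while residuals spread across a block of items suggest an additional dimension.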

McDonald’s Device for U: ‘Hierarchical’ factor analysis, aka general-specific factor model, bi-factor model (others?). The hierarchical model provides useful evidence of U.

McDonald argues that unidimensionality is supported when the items’ loadings on the general factor are larger than their loadings on the specific factors.
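
The comparison McDonald describes can be checked item by item once a general-specific (bi-factor) solution is in hand. The sketch below assumes such a solution has already been estimated elsewhere. The loading values and the function name `general_dominates` are hypothetical, chosen only to show the comparison.

```python
import numpy as np

# Hypothetical standardized loadings from a bi-factor solution:
# one general factor and two specific factors, six items.
general = np.array([0.70, 0.65, 0.60, 0.72, 0.55, 0.30])
specific = np.array([
    [0.30, 0.00],
    [0.25, 0.00],
    [0.35, 0.00],
    [0.00, 0.20],
    [0.00, 0.40],
    [0.00, 0.60],
])

def general_dominates(general, specific):
    """For each item, True if its absolute general-factor loading exceeds
    its largest absolute specific-factor loading."""
    general = np.abs(np.asarray(general, dtype=float))
    specific = np.abs(np.asarray(specific, dtype=float))
    return general > specific.max(axis=1)

dominates = general_dominates(general, specific)
for item, ok in enumerate(dominates):
    print(f"item {item}: general loading larger = {bool(ok)}")
print("items supporting U:", int(dominates.sum()), "of", len(dominates))
```

With these made-up values the last item loads more strongly on its specific factor than on the general factor, which is the kind of item that would weaken an argument for U under this device.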