Validity from the Perspective of Model-Based Reasoning (Robert J. Mislevy; Maryland Validity Conference, October 10, 2008)

Presentation transcript:

Maryland Validity Conference, Slide 1, October 10, 2008
Validity from the Perspective of Model-Based Reasoning
Robert J. Mislevy
Measurement, Statistics and Evaluation, University of Maryland, College Park
Presented at the conference “The Concept of Validity: Revisions, New Directions and Applications,” University of Maryland, College Park, MD, October 9-10, 2008. Supported by a grant from the Spencer Foundation.

Slide 2: Overview of the Talk
• Sources of unease
• Cognition in terms of patterns
• Model-based reasoning
• Measurement models as model-based reasoning
• Implications for validity
• Feeling better now

Slide 3: Sources of Unease (1)
Different models fit the same data
• Tatsuoka (1983) mixed number subtraction

Slide 4: Sources of Unease (1)
Cognitive diagnosis model for instruction
• Student characterized by a vector of 0/1 variables indicating which operations she had mastered
• Task characterized by which operations it requires
• Probability of correct response via latent class model
2PL IRT model for overall proficiency
• Student characterized by a univariate, continuous θ for proficiency in the domain
• Tasks modeled by difficulty and discrimination
• Probability of correct response via IRT model
[Figure labels: Container metaphor: Person B, Person D. Measurement metaphor: Items 1, 4, 5, 3, 6, 2; Persons A, B, D.]
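To make the contrast between the two models concrete, here is a minimal Python sketch of how each assigns a probability of a correct response. The 2PL formula is the standard one. The slide says only "latent class model", so the DINA-style form used for the cognitive diagnosis model is an assumption chosen for illustration, and every function name and numeric value is hypothetical.

```python
import math

def p_correct_2pl(theta, a, b):
    """2PL IRT: probability of a correct response given proficiency theta,
    item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def p_correct_dina(mastered, required, slip, guess):
    """DINA-style latent class model (assumed form, not taken from the slides).
    mastered and required are 0/1 tuples over the same list of operations.
    A student who has every required operation answers correctly unless she
    slips; any other student answers correctly only by guessing."""
    has_all = all(m >= r for m, r in zip(mastered, required))
    return 1.0 - slip if has_all else guess

# Hypothetical student and item values, for illustration only.
print(p_correct_2pl(theta=0.5, a=1.2, b=0.0))
print(p_correct_dina(mastered=(1, 1, 0), required=(1, 1, 0), slip=0.1, guess=0.2))
print(p_correct_dina(mastered=(1, 0, 0), required=(1, 1, 0), slip=0.1, guess=0.2))
```

The same response data can be passed through either function, which is the source of the unease: the student is a point on a continuum in one model and a pattern of mastered operations in the other.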

Slide 5: Sources of Unease (2)
Summary test scores, and factors based on them, have often been thought of as “signs” indicating the presence of underlying, latent traits. … An alternative interpretation of test scores as samples of cognitive processes and contents … is equally justifiable and could be theoretically more useful.
Snow & Lohman, 1989, p. 317

Slide 6: Sources of Unease (2)
The evidence from cognitive psychology suggests that test performances are comprised of complex assemblies of component information-processing actions that are adapted to task requirements during performance.
Snow & Lohman, 1989, p. 317

Slide 7: Sources of Unease (2)
The implication is that sign-trait interpretations of test scores and their intercorrelations are superficial summaries at best. At worst, they have misled scientists, and the public, into thinking of fundamental, fixed entities, measured in amounts.
Snow & Lohman, 1989, p. 317

Slide 8: Sources of Unease (2)
Whatever their practical value as summaries, for selection, classification, certification, or program evaluation, the cognitive psychological view is that such interpretations no longer suffice as scientific explanations of aptitude and achievement constructs.
Snow & Lohman, 1989, p. 317

Slide 9: Sources of Unease (3)
• What is the nature of parameters like θ? Where are they?
• What is the interpretation of the probabilities that arise from IRT, latent class / cognitive diagnosis models, and the like?
• What does this mean about validity of the data / the models / the uses of them?

Slide 10: Cognition in Terms of Patterns
• The sociocognitive paradigm
• Metaphors as foundation
• Formal model-based reasoning

Slide 11: The sociocognitive paradigm
• Converging ideas from cognitive psychology, neurology, anthropology, linguistics, science education, etc.
• Knowledge as patterns, at many levels…
• Assembled to understand, to interact with, and to create particular situations in the world
• Developed, strengthened, modified by use
• Associations of all kinds, including applicability, affordances, procedures, strategies, affect

Slide 12: Walter Kintsch’s CI Theory of Reading Comprehension
More focused research areas within cognitive psychology today differ as to their foci, methods, and levels of explanation. They include perception and attention, language and communication, development of expertise, situated and sociocultural psychology, and neurological bases of cognition.
[Diagram labels: Text; Text base; Situation Model; Context; Context 1; LTM]
Kintsch is focusing here on “experiential” cognition: not conscious, occurring at the scale of milliseconds. We’ll talk about reflective cognition in a couple minutes.

Slide 13: Walter Kintsch’s CI Theory of Reading Comprehension
[Diagram labels: Text; Text base; LTM; Situation Model; Action; Context; Context 1; Context 2]

Slide 14: Walter Kintsch’s CI Theory of Reading Comprehension
[Diagram labels: Text; Text base; LTM; Situation Model; Action; Context; Context 1; Context 2]

Slide 15: Walter Kintsch’s CI Theory of Reading Comprehension
[Diagram labels: Text; Text base; LTM; Situation Model; Action; Context; Context 2; Context 3]

Slide 16: Metaphors as foundation
• Lakoff & Johnson
  » Metaphors We Live By (1980); Philosophy in the Flesh (1999)
• Key idea:
  » Cognitive machinery builds from capabilities for interacting with the real physical and social world.
  » We extend and creatively recombine basic patterns and relationships to think about everything from … everyday things to extremely complicated and abstract social, conceptual, philosophical realms
• True of both experiential and reflective cognition.

Slide 17: Metaphors as foundation
Example: Containers
[Clip art image; credit: Artclips.com]

Slide 18: Metaphors as foundation
Example: Containers
• Everyday experience → Set theory
  » Very good, mostly.
• Knowledge as collection of discrete things inside our heads
  » Usually good and useful, in communication
  » Sometimes inapt, as sole basis of instructional practice and assessment design (the Jeopardy model of cognition; Rosie Perez in White Men Can’t Jump)

Slide 19: Metaphors as foundation
Example: Cause & Effect

Slide 20: Metaphors as foundation
Example: Cause & Effect
Newton’s laws; kinematics; quantitative models of force and motion, esp. F=MA

Slide 21: Metaphors as foundation
Example: Cause & Effect
[Diagram labels: θ; x_j]
IRT & SEM models; quantitative models for response probabilities, esp. Rasch’s P(x_j = 1) = exp(θ − β_j) / (1 + exp(θ − β_j))
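The Rasch model named on this slide parallels F=MA in being a compact quantitative form. A minimal Python sketch, assuming the standard logistic parameterization with person proficiency theta and item difficulty beta (the function names are hypothetical), shows that the log-odds of a correct response equal the difference theta − beta:

```python
import math

def rasch_p(theta, beta):
    """Rasch model: probability of a correct response for a person with
    proficiency theta on an item with difficulty beta."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

def log_odds(p):
    """Log-odds (logit) of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

# Hypothetical values: the logit of the response probability recovers
# the gap between proficiency and difficulty.
p = rasch_p(theta=2.0, beta=0.5)
print(p, log_odds(p))
```

When theta equals beta the probability is exactly one half, which is what gives "difficulty" and "proficiency" a common scale in the measurement metaphor.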

Slide 22: Metaphors as foundation
Example: Cause & Effect
• Everyday experience → F=MA
  » Very good, mostly.
• Teleological theories of history, à la Hegel
  » Not so good, mostly.

Slide 23: Model-Based Reasoning
[Diagram labels: Real-World Situation; Reconceived Real-World Situation; Entities and relationships; Representational Form A: y=ax+b; Representational Form B: (y-b)/a=x; Mappings among representational systems; Mainly semantic; Mainly syntactic]
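The two formulas on this slide can be read as two representational forms of the same modeled relationship, with a purely symbolic mapping between them. A small Python sketch, assuming Form A is y=ax+b and Form B is (y-b)/a=x (the function names and numbers are hypothetical), makes the round trip explicit:

```python
def form_a(x, a, b):
    """Representational Form A: compute y from x via y = a*x + b."""
    return a * x + b

def form_b(y, a, b):
    """Representational Form B: recover x from y via x = (y - b) / a.
    Only defined when a is nonzero."""
    if a == 0:
        raise ValueError("a must be nonzero to invert y = a*x + b")
    return (y - b) / a

# Hypothetical values: moving between the forms is symbol manipulation;
# what x and y stand for in the real-world situation is a separate question.
y = form_a(3.0, a=2.0, b=1.0)
print(y, form_b(y, a=2.0, b=1.0))
```

Nothing in either function says what x and y mean. That link to entities and relationships in the situation is the semantic part of the diagram, and it is where validity questions arise.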

Slide 24: Properties of Models (1)
• Human way to think about complex unique situations
• Abstract structure of entities, relationships, processes
• What’s included, what’s omitted
• Levels of analysis and grain size
  » Newtonian and quantum mechanics
  » Transmission genetics at level of species, individuals, cells, or molecules

Slide 25: Properties of Models (2)
• Can apply different models to same situation
  » Can view selling car to brother-in-law in terms of economic transaction model vs. family relationships model
• Models tuned to uses / problems / purposes
  » Mixed number subtraction

Slide 26: Properties of Models (2)
The modeling cycle:
[Cycle diagram labels: Evaluate; Revise; Model; Observe; Predict/Use]
  » Fit?
  » Does it work?
  » What’s left out?
  » Adequacy of rationale?

Slide 27: Models with probabilistic layers
• Probability from analogy with physical games of chance (Shafer)
• Probability connects to model representation
  » Key in model criticism
• Model posits space for patterns; parameter values characterize them; probability models can characterize …
  » Variation in patterns
  » Modeler’s uncertainty about patterns & parameters
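One way to see probability expressing the modeler's uncertainty about a parameter is a grid-based Bayesian update. The sketch below assumes a Rasch model, a standard normal prior on theta, and made-up item difficulties and responses; none of these specifics come from the slides.

```python
import math

def rasch_p(theta, beta):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

def posterior_on_grid(responses, betas, grid):
    """Posterior weights for theta at each grid point, combining an assumed
    standard normal prior with the Rasch likelihood of the 0/1 responses."""
    weights = []
    for theta in grid:
        prior = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) density
        likelihood = 1.0
        for x, beta in zip(responses, betas):
            p = rasch_p(theta, beta)
            likelihood *= p if x == 1 else 1.0 - p
        weights.append(prior * likelihood)
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical data: three items, the hardest one answered incorrectly.
grid = [i / 10.0 for i in range(-40, 41)]
betas = [-1.0, 0.0, 1.0]
post = posterior_on_grid([1, 1, 0], betas, grid)
mean = sum(t * w for t, w in zip(grid, post))
sd = math.sqrt(sum((t - mean) ** 2 * w for t, w in zip(grid, post)))
print(round(mean, 2), round(sd, 2))
```

The posterior spread describes what the modeler knows about theta after three responses. It is a statement made within the model, not a claim that a quantity with that value sits inside the examinee.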

Slide 28: Psychometric / Measurement Models
• E.g., IRT, CTT, FA, SEM, CDM
• Model posits space for patterns, parameter values characterize them
• Semantic layer is cause & effect metaphor
  » Q: In what sense does θ “cause” X?
  » A: The C&E metaphor grounds productive connection between observations and inferences
• Modeling patterns across people, not explaining item responses (Snow & Lohman)
  » Could model within-person processes at finer grain size

Slide 29: Some answers
• What is the nature of parameters like θ? Where are they?
  » These are characterizations of patterns we observe in real-world situations (ones we in part construct for target uses) through the lens of a simplified model we are (provisionally) using to think about those situations and the use situations in which the patterns are apt to be relevant.
  » So they are in our heads, but they aren’t worth much unless they reflect patterns in examinees’ actions in the world.

Slide 30: Some answers
• What is the interpretation of the probabilities that arise from IRT, latent class / cognitive diagnosis models, and the like?
  » These are characterizations of patterns we observe in situations and our degree of knowledge about them, again through the lens of a simplified model we are (provisionally) using to think about those situations.
  » In addition to guiding inference through the model, they provide tools for seeing where the model may be misleading or inadequate.
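As an illustration of probabilities serving as a tool for spotting where a model may mislead, the sketch below computes standardized residuals of observed responses against Rasch expectations. The Rasch form, the function names, and the example values are assumptions made for illustration, not a procedure described in the talk.

```python
import math

def rasch_p(theta, beta):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

def standardized_residual(x, theta, beta):
    """(observed - expected) / standard deviation for one 0/1 response.
    Large absolute values flag responses the model finds surprising."""
    p = rasch_p(theta, beta)
    return (x - p) / math.sqrt(p * (1.0 - p))

# Hypothetical case: a proficient examinee misses a very easy item
# but answers the harder items correctly.
theta = 2.0
betas = [-2.0, 0.0, 2.0]
responses = [0, 1, 1]
for x, beta in zip(responses, betas):
    print(beta, round(standardized_residual(x, theta, beta), 2))
```

A large residual does not say what went wrong. It only marks a place where the simplified model and the examinee's actions part ways, and so where another model or grain size may be needed.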

Slide 31: Some answers
• What does this mean about validity of the data / the models / the uses of them?

Slide 32: Validity Evidence
[Diagram labels: Real-World Situation; Reconceived Real-World Situation; Entities and relationships; Representational Form A: y=ax+b; Representational Form B: (y-b)/a=x; Mappings among representational systems]
• Theory and experience supporting the narrative/scientific frame
• Empirical evaluation of predictions / outcomes
• Theoretical and empirical grounding of task design
• Theoretical and empirical grounding of task-scoring procedures

Slide 33: Validity Implications, Sense 1
The currently dominant view:
Validity is an integrated evaluative judgement of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment. (Messick, 1989)
• Focus on situated use of data from test
• Consistent with the model-based reasoning (MBR) perspective; i.e., reasoning through a psychometric model in particular situations & inferences.

Slide 34: Validity Implications, Sense 2
Alternative (e.g., Wiley, Borsboom, Lissitz):
[A] test is valid for measuring an attribute if and only if (a) the attribute exists and (b) variations in the attribute causally produce variations in the outcomes of the measurement procedure. (Borsboom et al., 2004)
MBR view can omit specific uses, but
  » must consider range of situations and uses that are apt to be thought about effectively via the model.
  » Broader range consistent with scientific program, in opposition to Snow & Lohman quote.
  » Is realist, but strong correspondence to existence of traits qua traits in individuals is not required.

Slide 35: I am Feeling Better Now
Model-based reasoning provides a way of thinking about validity that …
• is consistent with the practical methods that have developed to assure quality of inferences from assessments
• is realist, in constructive-realism and L&J’s “embodied realism” sense
• is consistent with developments in cognitive psychology, including the nature of scientific reasoning, and the meaning of probability.