Psychometrics to Support RtI Assessment Design
Michael C. Rodriguez, University of Minnesota
February 2010

Similar presentations
Ed-D 420 Inclusion of Exceptional Learners. CAT time Learner-Centered - Learner-centered techniques focus on strategies and approaches to improve learning.

Evaluating and Institutionalizing
Assessing Student Performance
Introduction to IRT/Rasch Measurement with Winsteps Ken Conrad, University of Illinois at Chicago Barth Riley and Michael Dennis, Chestnut Health Systems.
How to Make a Test & Judge its Quality. Aim of the Talk Acquaint teachers with the characteristics of a good and objective test See Item Analysis techniques.
You can use this presentation to: Gain an overall understanding of the purpose of the revised tool Learn about the changes that have been made Find advice.
Chapter 5 Measurement, Reliability and Validity.
Part II Sigma Freud & Descriptive Statistics
AN OVERVIEW OF THE FAMILY OF RASCH MODELS Elena Kardanova
STANDARDIZED TESTING MEASUREMENT AND EVALUATION  All teaching involves evaluation. We compare information to criteria and ten make judgments.  Measurement.
Concept of Measurement
Teaching and Testing Pertemuan 13
Item Response Theory. Shortcomings of Classical True Score Model Sample dependence Limitation to the specific test situation. Dependence on the parallel.
Presented by: Mohsen Saberi and Sadiq Omarmeli  Language testing has improved parallel to advances in technology.  Two basic questions in testing;
Why Scale -- 1 Summarising data –Allows description of developing competence Construct validation –Dealing with many items rotated test forms –check how.
Classroom Assessment A Practical Guide for Educators by Craig A
Copyright © 2005 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Personality Assessment, Measurement, and Research Design.
IEA Teacher Education Study in Mathematics 3 rd NRC Meeting, June 25-29, Taipei, Chinese Taipei. Michael C. Rodriguez University of Minnesota & Michigan.
Item Analysis: Classical and Beyond SCROLLA Symposium Measurement Theory and Item Analysis Modified for EPE/EDP 711 by Kelly Bradley on January 8, 2013.
Multivariate Methods EPSY 5245 Michael C. Rodriguez.
Item Response Theory for Survey Data Analysis EPSY 5245 Michael C. Rodriguez.
Copyright © 2012 Wolters Kluwer Health | Lippincott Williams & Wilkins Chapter 14 Measurement and Data Quality.
Unanswered Questions in Typical Literature Review 1. Thoroughness – How thorough was the literature search? – Did it include a computer search and a hand.
Induction to assessing student learning Mr. Howard Sou Session 2 August 2014 Federation for Self-financing Tertiary Education 1.
Classroom Assessments Checklists, Rating Scales, and Rubrics
Classroom Assessment A Practical Guide for Educators by Craig A
The ABC’s of Pattern Scoring Dr. Cornelia Orr. Slide 2 Vocabulary Measurement – Psychometrics is a type of measurement Classical test theory Item Response.
Lecture 12 Statistical Inference (Estimation) Point and Interval estimation By Aziza Munir.
6. Evaluation of measuring tools: validity Psychometrics. 2012/13. Group A (English)
NORMS. Scores on psychological tests are most commonly interpreted by reference to norms ; which represent the test performance of the standardized sample.
The Next Generation Science Standards: 4. Science and Engineering Practices Professor Michael Wysession Department of Earth and Planetary Sciences Washington.
Classroom Diagnostic Tools. Pre-Formative Assessment of Current CDT Knowledge.
Measurement Validity.
+ ©2014 McGraw-Hill Higher Education. All rights reserved. Chapter 2 Personality Assessment, Measurement, and Research Design.
Assessment and Testing
The ABC’s of Pattern Scoring
JS Mrunalini Lecturer RAKMHSU Data Collection Considerations: Validity, Reliability, Generalizability, and Ethics.
SOCW 671: #5 Measurement Levels, Reliability, Validity, & Classic Measurement Theory.
+ Chapter Scientific Method variable is the factor that changes in an experiment in order to test a hypothesis. To test for one variable, scientists.
Criterion-Referenced Testing and Curriculum-Based Assessment EDPI 344.
Reliability performance on language tests is also affected by factors other than communicative language ability. (1) test method facets They are systematic.
SECOND EDITION Chapter 5 Standardized Measurement and Assessment
Item Analysis: Classical and Beyond SCROLLA Symposium Measurement Theory and Item Analysis Heriot Watt University 12th February 2003.
Chapter 6 - Standardized Measurement and Assessment
PSYCHOMETRICS. SPHS 5780, LECTURE 6: PSYCHOMETRICS, “STANDARDIZED ASSESSMENT”, NORM-REFERENCED TESTING.
Classroom Assessment Chapters 4 and 5 ELED 4050 Summer 2007.
Essentials for Measurement. Basic requirements for measuring 1) The reduction of experience to a one dimensional abstraction. 2) More or less comparisons.
TEST SCORES INTERPRETATION - is a process of assigning meaning and usefulness to the scores obtained from classroom test. - This is necessary because.
STANDARDIZED TESTING Understanding the Vocabulary.
© 2009 Pearson Prentice Hall, Salkind. Chapter 5 Measurement, Reliability and Validity.
Classroom Assessments Checklists, Rating Scales, and Rubrics
Personality Assessment, Measurement, and Research Design
Classroom Assessment A Practical Guide for Educators by Craig A
VALIDITY by Barli Tambunan/
Reliability and Validity in Research
Assessment Theory and Models Part II
Introduction to the Validation Phase
Introduction to the Validation Phase
Item Analysis: Classical and Beyond
Classroom Assessments Checklists, Rating Scales, and Rubrics
Week 3 Class Discussion.
Reliability and Validity of Measurement
Confidence in Competence
Standards and Assessment Alternatives
EPSY 5245 EPSY 5245 Michael C. Rodriguez
Frameworks for Considering Item Response Demands
Personality Assessment, Measurement, and Research Design
Item Analysis: Classical and Beyond
Item Analysis: Classical and Beyond
Presentation transcript:


The Rasch Approach
Construct Definition – A simple form: more or less, high to low
Item Development – Realizations of the construct
Outcome Space – The aspect of the response we value; how to score it
Measurement Model – How we relate scores to constructs

From Construct to Item Responses
[Diagram: Construct → Item Responses, linked through the Outcome Space and Measurement Model; causality runs from construct to responses, and inferences run back from responses to construct. Source: Mark Wilson, 2005]

Measurement Models: IRT vs. Rasch
Most IRT models are based on a paradigm that identifies a model to explain variation in the data – the goal is to find the model that best characterizes the data.
Rasch is an approach based on the paradigm of constructing a measure that can characterize a construct on a linear scale – such that the total score fully characterizes a person on that construct.

Rasch Philosophy
Rasch models provide a basis and justification for obtaining person locations on a continuum from total scores on assessments. Although it is not uncommon to treat total scores directly as measurements, they are actually counts of discrete observations rather than measurements.

Each observation represents the observable outcome of a comparison between a person and an item. Such outcomes are directly analogous to observing the rotation of a balance scale in one direction or the other. That observation indicates that one object has greater mass than the other, but counts of such observations cannot be treated directly as measurements.
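The person–item comparison behind each observation is captured by the dichotomous Rasch model, where the probability of success depends only on the difference between person location θ and item difficulty b. A minimal sketch (the difficulty values are illustrative):

```python
import math

def rasch_p(theta, b):
    """Probability of a correct response under the dichotomous Rasch model:
    P(X = 1 | theta, b) = exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# A person at theta = 0 facing items of increasing difficulty:
probs = [rasch_p(0.0, b) for b in (-2.0, -1.0, 0.0, 1.0, 2.0)]

# The expected total score is the sum of these probabilities -- which is
# why the total score (a count of successes) carries all the information
# about the person's location under the Rasch model.
expected_total = sum(probs)
```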

Item Characteristic Curve

Test Characteristic Curve
[Figures: the test characteristic curve maps raw scores onto the Rasch scale. The same 4-point difference on the raw score scale corresponds to 0.5 points on the Rasch scale in one region of the curve and 1.0 point in another, showing that raw scores are not a linear function of the underlying trait.]
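This nonlinearity can be checked numerically: the test characteristic curve (TCC) is the sum of the item characteristic curves, so equal logit steps yield unequal raw-score gains depending on where on the scale they occur. A sketch under assumed item difficulties:

```python
import math

def rasch_p(theta, b):
    # Dichotomous Rasch model item characteristic curve.
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def tcc(theta, difficulties):
    # Test characteristic curve: expected raw score at a given theta.
    return sum(rasch_p(theta, b) for b in difficulties)

# Hypothetical 50-item test with difficulties spread from -2 to +2 logits.
items = [-2.0 + 4.0 * i / 49 for i in range(50)]

# Raw-score gain from the same half-logit step, taken near the middle
# of the scale versus out toward the extreme:
mid = tcc(0.5, items) - tcc(0.0, items)   # steep part of the TCC
ext = tcc(3.5, items) - tcc(3.0, items)   # flat part of the TCC
# The same logit step buys more raw-score points in the middle than at
# the extreme -- equal raw-score gaps are not equal trait gaps.
```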

From Numbers to Meaning
Numbers themselves do not mean much. Is 10 meters a short distance? A long distance? We need context to bring meaning to the measure: 10 meters. However, 10 meters should always be 10 meters, no matter who takes the measure or how it is taken.

Sample Dependent Statistics
Is an item with a p-value of .90 easy or difficult? … 90% passed the item.
Is a person with a score of 5 out of 50 items low in ability? … correctly answered 10% of the items.
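The sample dependence of classical item statistics is easy to demonstrate: the same item yields very different p-values when administered to groups of different ability. A simulation sketch (the group ability distributions are assumed for illustration):

```python
import math
import random

def rasch_p(theta, b):
    # Probability of a correct response under the dichotomous Rasch model.
    return 1.0 / (1.0 + math.exp(-(theta - b)))

random.seed(42)

def p_value(item_difficulty, group_abilities):
    """Classical p-value: the proportion of the group answering correctly."""
    correct = sum(random.random() < rasch_p(th, item_difficulty)
                  for th in group_abilities)
    return correct / len(group_abilities)

item_b = 0.0  # one item of moderate difficulty
low_group = [random.gauss(-1.5, 0.5) for _ in range(1000)]
high_group = [random.gauss(+1.5, 0.5) for _ in range(1000)]

# Same item, different samples, different apparent "difficulty":
p_low = p_value(item_b, low_group)     # well below .5
p_high = p_value(item_b, high_group)   # well above .5
```

The item's Rasch difficulty (0.0 logits) never changed; only the sample did, which is exactly why p-values cannot serve as sample-free difficulty estimates.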

Rasch Scaling
Person-free item difficulty – Locates the items on the ability continuum
Item-free person ability – Locates the persons on the ability continuum
Places items and persons on the same scale – the ITEM MAP

Item Map
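An item map (Wright map) places person and item locations on the same logit scale, persons on one side and items on the other. A minimal text rendering, with made-up locations:

```python
# Hypothetical person and item locations (logits) on the same Rasch scale.
persons = [-1.2, -0.4, 0.1, 0.8, 1.5]
items = {"item A": -1.0, "item B": 0.0, "item C": 1.2}

def wright_map(persons, items, lo=-2.0, hi=2.0, step=0.5):
    """Render a crude item map: persons (X) to the left of the logit
    ruler, items to the right, one row per half-logit bin."""
    rows = []
    edge = hi
    while edge > lo - 1e-9:
        xs = "X" * sum(edge - step < p <= edge for p in persons)
        labels = " ".join(n for n, b in items.items() if edge - step < b <= edge)
        rows.append(f"{xs:>6} |{edge:5.1f}| {labels}")
        edge -= step
    return "\n".join(rows)

print(wright_map(persons, items))
```

Reading down the map shows, at a glance, which items sit at the ability levels where persons cluster and where the scale has gaps.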

Construct Map
1. Explains the construct; serves as an interpretation guide.
2. Enables design of items that lead children to give responses informing important levels of the construct map; identifies relevant item features.
3. Provides criteria for analyzing responses for their degree of consistency with the construct map.
4. Item selection or retention should be based on informed professional judgment.

Construct Map Describing Task Characteristics

Immediate Benefits to Support IGDIs
From limitations of relying on card difficulty – to locating cards along the trait continuum
From limitations of card discrimination – to examining card–trait level correlations
From ad-hoc decision making – to coherence in item design, item selection, scoring, analysis, interpretation, and decision making
From a norm-referenced interpretation – to a criterion-referenced framework