The ABC’s of Pattern Scoring

Similar presentations
Psychometrics to Support RtI Assessment Design Michael C. Rodriguez University of Minnesota February 2010.

Test Development.
Advanced Methods and Analysis for the Learning and Social Sciences PSY505 Spring term, 2012 January 23, 2012.
Item Response Theory in a Multi-level Framework Saralyn Miller Meg Oliphint EDU 7309.
Scaling of the Cognitive Data and Use of Student Performance Estimates: Guide to the PISA Data Analysis Manual.
Item Response Theory in Health Measurement
Introduction to Item Response Theory
AN OVERVIEW OF THE FAMILY OF RASCH MODELS Elena Kardanova
Models for Measuring. What do the models have in common? They are all cases of a general model. How are people responding? What are your intentions in.
Item Response Theory. Shortcomings of Classical True Score Model Sample dependence Limitation to the specific test situation. Dependence on the parallel.
Chapter 9 Flashcards. measurement method that uses uniform procedures to collect, score, interpret, and report numerical results; usually has norms and.
Standardized Test Scores Common Representations for Parents and Students.
Classical Test Theory. By ____________________. What is CTT?
Item Analysis: Classical and Beyond SCROLLA Symposium Measurement Theory and Item Analysis Modified for EPE/EDP 711 by Kelly Bradley on January 8, 2013.
Identification of Misfit Item Using IRT Models Dr Muhammad Naveed Khalid.
Item Response Theory Psych 818 DeShon. IRT ● Typically used for 0,1 data (yes, no; correct, incorrect) – Set of probabilistic models that… – Describes.
Item Response Theory. What’s wrong with the old approach? Classical test theory –Sample dependent –Parallel test form issue Comparing examinee scores.
Measurement 102 Steven Viger Lead Psychometrician
Item Analysis - Outline 1. Types of test items A. Selected response items B. Constructed response items 2. Parts of test items 3. Guidelines for writing.
Introduction to plausible values National Research Coordinators Meeting Madrid, February 2010.
Modern Test Theory Item Response Theory (IRT). Limitations of classical test theory An examinee’s ability is defined in terms of a particular test The.
Technical Adequacy Session One Part Three.
Out with the Old, In with the New: NYS Assessments “Primer” Basics to Keep in Mind & Strategies to Enhance Student Achievement Maria Fallacaro, MORIC
UNIT IV ITEM ANALYSIS IN TEST DEVELOPMENT CHAP 14: ITEM ANALYSIS CHAP 15: INTRODUCTION TO ITEM RESPONSE THEORY CHAP 16: DETECTING ITEM BIAS 1.
Test item analysis: When are statistics a good thing? Andrew Martin Purdue Pesticide Programs.
Measuring Mathematical Knowledge for Teaching: Measurement and Modeling Issues in Constructing and Using Teacher Assessments DeAnn Huinker, Daniel A. Sass,
Evaluation for the Test Quality of Dynamic Question Generation by Particle Swarm Optimization for Adaptive Testing. Intelligent Systems Laboratory (iLab), Department of Information Engineering, Southern Taiwan University.
IRT Model Misspecification and Metric Consequences Sora Lee Sien Deng Daniel Bolt Dept of Educational Psychology University of Wisconsin, Madison.
Measuring Human Intelligence with Artificial Intelligence Adaptive Item Generation Sangyoon Yi Susan E. Embretson.
Educational Psychology: Theory and Practice Chapter 14 Standardized Tests This multimedia product and its contents are protected under copyright law. The.
Test Scaling and Value-Added Measurement Dale Ballou Vanderbilt University April, 2008.
NORMS. Scores on psychological tests are most commonly interpreted by reference to norms, which represent the test performance of the standardization sample.
MEASUREMENT: SCALE DEVELOPMENT Lu Ann Aday, Ph.D. The University of Texas School of Public Health.
University of Georgia – Chemistry Department JExam - A Method to Measure Outcomes Assessment Charles H. Atwood, Kimberly D. Schurmeier, and Carrie G. Shepler.
A COMPARISON METHOD OF EQUATING CLASSIC AND ITEM RESPONSE THEORY (IRT): A CASE OF IRANIAN STUDY IN THE UNIVERSITY ENTRANCE EXAM Ali Moghadamzadeh, Keyvan.
Differential Item Functioning. Anatomy of the name DIFFERENTIAL –Differential Calculus? –Comparing two groups ITEM –Focus on ONE item at a time –Not the.
Scaling and Equating Joe Willhoft Assistant Superintendent of Assessment and Student Information Yoonsun Lee Director of Assessment and Psychometrics Office.
Validity and Item Analysis Chapter 4. Validity Concerns what the instrument measures and how well it does that task Not something an instrument has or.
University of Ostrava, Czech Republic, 26-31 March 2012.
Multitrait Scaling and IRT: Part I Ron D. Hays, Ph.D. Questionnaire Design and Testing.
Estimation. The Model Probability. The Model for N Items. The vector probability takes this form if we assume independence.
Item Factor Analysis Item Response Theory Beaujean Chapter 6.
Reliability performance on language tests is also affected by factors other than communicative language ability. (1) test method facets They are systematic.
Latent regression models. Where does the probability come from? Why isn't the model deterministic? Each item tests something unique – We are interested.
Item Parameter Estimation: Does WinBUGS Do Better Than BILOG-MG?
Item Analysis: Classical and Beyond SCROLLA Symposium Measurement Theory and Item Analysis Heriot Watt University 12th February 2003.
The Design of Statistical Specifications for a Test Mark D. Reckase Michigan State University.
2. Main Test Theories: The Classical Test Theory (CTT) Psychometrics. 2011/12. Group A (English)
Item Response Theory Dan Mungas, Ph.D. Department of Neurology University of California, Davis.
Item Response Theory and Computerized Adaptive Testing Hands-on Workshop, day 2 John Rust, Iva Cek,
Lesson 2 Main Test Theories: The Classical Test Theory (CTT)
Chapter 2 Norms and Reliability. The essential objective of test standardization is to determine the distribution of raw scores in the norm group so that.
IRT Equating Kolen & Brennan, 2004 & 2014 EPSY
The Impact of Item Response Theory in Educational Assessment: A Practical Point of View Cees A.W. Glas University of Twente, The Netherlands University.
Using Item Response Theory to Track Longitudinal Course Changes
Item Analysis: Classical and Beyond
Measurement. MANA 4328, Dennis C. Veit.
Mohamed Dirir, Norma Sinclair, and Erin Strauts
Evaluating Multi-item Scales
Presentation transcript:

The ABC’s of Pattern Scoring Dr. Cornelia Orr

Vocabulary
- Measurement – psychometrics is a specialized application of measurement
- Classical test theory
- Item Response Theory (IRT), also known as latent trait theory
- 1-, 2-, and 3-parameter IRT models
- Pattern scoring
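The three IRT models named above differ only in how many item parameters they estimate. A minimal sketch of the logistic response-probability functions (the function names are illustrative; a, b, and c are the conventional discrimination, difficulty, and guessing parameters):

```python
import math

def p_3pl(theta, a, b, c):
    """3PL: probability of a correct response given ability theta.
    a = discrimination, b = difficulty, c = pseudo-guessing (lower asymptote)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def p_2pl(theta, a, b):
    """2PL: the 3PL with the guessing parameter fixed at zero."""
    return p_3pl(theta, a, b, 0.0)

def p_1pl(theta, b):
    """1PL (Rasch form): one common discrimination shared by all items."""
    return p_3pl(theta, 1.0, b, 0.0)
```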

General & Specialized Measurement
Measurement: assigning numbers to objects or events
- Ex. – time, height, earthquakes, hurricanes, the stock market
Psychometrics: assigning numbers to psychological characteristics
- Ex. – personality, IQ, opinions, interests, knowledge

Different Theories of Psychometrics
Classical Test Theory:
- Item discrimination values
- Item difficulty values (p-values)
- Guessing (penalty)
- Number-correct scoring
Item Response Theory:
- Item discrimination values
- Item difficulty values
- Guessing (pseudo-guessing) values
- Pattern scoring
Similar constructs – different derivations.
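The classical quantities listed above are simple to compute from a scored 0/1 response matrix. A sketch, assuming one list of 0/1 scores per item and a list of examinees' total test scores (variable names are illustrative):

```python
import statistics

def item_p_value(item_scores):
    """CTT difficulty: the proportion of examinees who answered correctly."""
    return sum(item_scores) / len(item_scores)

def item_discrimination(item_scores, total_scores):
    """CTT discrimination: point-biserial correlation between the 0/1 item
    score and each examinee's total test score."""
    n = len(item_scores)
    mean_i = sum(item_scores) / n
    mean_t = sum(total_scores) / n
    cov = sum((i - mean_i) * (t - mean_t)
              for i, t in zip(item_scores, total_scores)) / n
    return cov / (statistics.pstdev(item_scores) * statistics.pstdev(total_scores))
```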

Different Methods of Scoring
Number-Correct Scoring – simple mathematics:
- Raw scores (# of points); mean, SD, SEM, % correct
- Number-right scale
- Score conversions: scale scores, percentile ranks, etc.
Pattern Scoring – complex mathematics:
- Maximum likelihood estimates, based on item statistics and the student's answer pattern, with an SEM
- Theta scale (mean = 0, standard deviation = 1)
- Score conversions: scale scores, percentile ranks, etc.
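Theta itself is rarely reported; it is converted to a reporting scale with a linear transformation. A sketch with hypothetical conversion constants (operational programs publish their own slope and intercept):

```python
def theta_to_scale_score(theta, slope=50.0, intercept=300.0):
    """Linear conversion from the theta metric (mean 0, SD 1) to a
    reporting scale; the slope and intercept here are hypothetical."""
    return intercept + slope * theta

print(theta_to_scale_score(0.0))   # 300.0 - an average examinee
print(theta_to_scale_score(1.5))   # 375.0 - 1.5 SDs above average
```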

Comparison: Number-Correct and Pattern Scoring
Similarities:
- The relationship of derived scores is the same. For example, a given scale score corresponds to the same percentile rank under both methods.
Differences:
- The methods of deriving scores
- The number of possible scale scores: number-right scoring is limited by the number of items (one possible score per raw point), while IRT pattern scoring is effectively unlimited, or limited only by the range of the scale (e.g., 100-500).
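This difference is easy to see on a tiny test: five items allow only six distinct number-correct scores but 2^5 = 32 distinct response patterns, each of which pattern scoring can map to its own scale score. A quick check:

```python
from itertools import product

n_items = 5
patterns = list(product([0, 1], repeat=n_items))

print(len(patterns))                        # 32 distinct response patterns
print(sorted({sum(p) for p in patterns}))   # only 6 number-correct scores: 0..5
```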

Choosing the Scoring Method
- Which model? Simple vs. complex? Which gives the best estimates? What are the advantages and disadvantages?
- Ex. – Why do examinees with the same number correct get different scale scores?
- Ex. – Flat-screen TVs – how do they do that?

Disadvantages of IRT and Pattern Scoring
- Complex, highly technical mathematics
- Difficult to explain and difficult to understand
- "It doesn't add up!" – the score is not a simple sum of points
- Can be perceived as hocus-pocus

Advantages of IRT and Pattern Scoring
- Better estimates of an examinee's ability: the score that is most likely, given the student's responses to the questions on the test (maximum likelihood scoring)
- More information about students and items is used
- Higher reliability than number-right scoring
- Less measurement error (smaller SEM)
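Maximum likelihood scoring picks the theta at which the observed answer pattern is most probable, and the SEM is 1/sqrt(test information) at that theta. A minimal grid-search sketch under the 3PL model (operational scoring programs typically use Newton-Raphson or Bayesian estimators instead):

```python
import math

def p_3pl(theta, a, b, c):
    # 3PL probability of a correct response
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, pattern):
    # Sum log P for correct answers (1) and log(1 - P) for incorrect (0)
    ll = 0.0
    for (a, b, c), u in zip(items, pattern):
        p = p_3pl(theta, a, b, c)
        ll += math.log(p) if u == 1 else math.log(1.0 - p)
    return ll

def mle_theta(items, pattern, lo=-4.0, hi=4.0, step=0.01):
    # Grid search for the theta that makes the observed pattern most likely
    grid = [lo + i * step for i in range(int((hi - lo) / step) + 1)]
    return max(grid, key=lambda t: log_likelihood(t, items, pattern))

def sem(theta, items):
    # SEM = 1 / sqrt(test information), using the 3PL item information formula
    info = 0.0
    for a, b, c in items:
        p = p_3pl(theta, a, b, c)
        info += a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2
    return 1.0 / math.sqrt(info)

# Example with three hypothetical items; the examinee answers 1, 1, 0
items = [(1.2, -0.5, 0.2), (0.9, 0.0, 0.2), (1.5, 0.8, 0.2)]
theta_hat = mle_theta(items, [1, 1, 0])
print(theta_hat, sem(theta_hat, items))
```

Note that all-correct and all-incorrect patterns have no finite maximum likelihood estimate; real scoring programs handle those cases specially.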

Item Characteristic Curve (ICC)
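An ICC plots the probability of a correct response against ability; under the 3PL it rises from the guessing floor c toward 1, with its steepness set by a and its location by b. A sketch that draws one such curve (the item parameters here are arbitrary):

```python
import numpy as np
import matplotlib.pyplot as plt

a, b, c = 1.2, 0.0, 0.2  # arbitrary discrimination, difficulty, guessing
theta = np.linspace(-4, 4, 200)
p = c + (1 - c) / (1 + np.exp(-a * (theta - b)))

plt.plot(theta, p)
plt.xlabel("Ability (theta)")
plt.ylabel("P(correct response)")
plt.title("Item Characteristic Curve (3PL)")
plt.ylim(0, 1)
plt.show()
```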

Examples: Effects of Item Discrimination
5 items (MC = multiple choice) with equal difficulty and decreasing discrimination:

  Item  Type  a       b        c
  1     MC    0.0250  300.000  0.2
  2     MC    0.0200  300.000  0.2
  3     MC    0.0150  300.000  0.2
  4     MC    0.0100  300.000  0.2
  5     MC    0.0050  300.000  0.2

4 examinees' response patterns (1 = correct, items 1-5), with the resulting SEM and scale score (SS):

  Pattern  SEM  SS
  11100    39   300
  01110    46   278
  00111    61   258
  10011    94   260
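These values can be approximated by running a maximum likelihood scorer directly in the scale-score metric with the item parameters above. A sketch (the exact SEM/SS figures depend on the estimator the original program used, so small differences from the table are expected):

```python
import math

ITEMS = [(0.0250, 300.0, 0.2), (0.0200, 300.0, 0.2), (0.0150, 300.0, 0.2),
         (0.0100, 300.0, 0.2), (0.0050, 300.0, 0.2)]
PATTERNS = ["11100", "01110", "00111", "10011"]

def p3(theta, a, b, c):
    # 3PL probability, with theta and b expressed in the scale-score metric
    return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

def pattern_score(pattern, items, lo=100.0, hi=500.0, step=0.5):
    # Grid search for the scale score that maximizes the likelihood
    grid = [lo + i * step for i in range(int((hi - lo) / step) + 1)]
    def log_lik(t):
        return sum(math.log(p3(t, *it)) if u == "1" else math.log(1 - p3(t, *it))
                   for it, u in zip(items, pattern))
    return max(grid, key=log_lik)

for pat in PATTERNS:
    print(pat, round(pattern_score(pat, ITEMS)))
```

The ordering is the point: answering the highly discriminating items correctly (11100) earns a higher score than answering the same number of weakly discriminating items correctly.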

Examples: Effects of Item Difficulty
5 items with equal discrimination and increasing difficulty:

  Item  Type  a       b        c
  1     MC    0.0150  250.000  0.1
  2     MC    0.0150  275.000  0.1
  3     MC    0.0150  300.000  0.1
  4     MC    0.0150  325.000  0.1
  5     MC    0.0150  350.000  0.1

4 examinees' response patterns (1 = correct, items 1-5), with the resulting SEM and scale score (SS):

  Pattern  SEM  SS
  11100    43   300
  01110    43   305
  00111    43   299
           43   310

Missing easy items can result in a lower score.
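Running the same grid-search scorer sketched above on this item set shows why the closing note holds: under the 3PL, the nonzero guessing parameter makes scores pattern-dependent even when every item has the same discrimination, because a correct answer on a hard item is partly discounted as a possible guess. And since the four estimated scores sit close together and the items share one a-value, the test information is nearly identical across patterns, which is why the SEM column is flat at 43.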