Copyright © 2006 Educational Testing Service Listening. Learning. Leading. Using Differential Item Functioning to Investigate the Impact of Accommodations.

Similar presentations
Designing Accessible Reading Assessments National Accessible Reading Assessment Projects General Advisory Committee December 7, 2007 Overview of DARA Project.

Copyright © 2004 Educational Testing Service Listening. Learning. Leading. Using Differential Item Functioning to Analyze a State English-language Arts.
Technology Assisted Reading Assessment National Accessible Reading Assessment Projects General Advisory Committee December 7, 2007 Overview of TARA project.
Impact of Read Aloud on Test of Reading Comprehension Cara Cahalan-Laitusis Educational Testing Service.
Copyright © 2004 Educational Testing Service Listening. Learning. Leading. Using DIF to Examine the Validity and Fairness of Assessments for Students With.
Examining Differential Boost from Read Aloud on a Test of Reading Comprehension at Grades 4 and 8 Cara Cahalan-Laitusis, Linda Cook, Fred Cline, and Teresa.
National Accessible Reading Assessment Projects DARA Research and Next Steps Cara Cahalan-Laitusis & Linda Cook Educational Testing Service.
Listening. Learning. Leading. Using Factor Analysis to Investigate the Impact of Accommodations on the Scores of Students with Disabilities on English-Language.
Examining Differential Boost from Read Aloud on a Test of Reading Comprehension at Grades 4 and 8 Cara Cahalan-Laitusis Linda Cook Fred Cline Teresa King.
Designing Accessible Reading Assessments Examining Test Items for Differential Distractor Functioning Among Students with Learning Disabilities Kyndra.
National Accessible Reading Assessment Projects National Accessible Reading Assessment Projects General Advisory Committee December 8, 2006 Overview of.
1 Matching EPIET introductory course Mahón, 2011.
1 What Is The Next Step? - A review of the alignment results Liru Zhang, Katia Forêt & Darlene Bolig Delaware Department of Education 2004 CCSSO Large-Scale.
Determine Eligibility Chapter 4. Determine Eligibility 4-2 Objectives Search for Customer on database Enter application signed date and eligibility determination.
Inclusions Regulation Testing Situations Decide if the situation is a violation or not a violation. Cite the page and paragraph number from the document(s)
1 Adequate Yearly Progress 2005 Status Report Research, Assessment & Accountability November 2, 2005 Oakland Unified School District.
A presentation to the Board of Education
1 R-2 Report: Read and write at the end of third grade Review of Progress and Approval of Targets A presentation to the Board by Vince.
Module 2 Sessions 10 & 11 Report Writing.
Experimental and Quasiexperimental Designs Chapter 10 Copyright © 2009 Elsevier Canada, a division of Reed Elsevier Canada, Ltd.
1 New York State English as a Second Language Achievement Test (NYSESLAT) Presented by: Vanessa Lee Mercado Assistant in Educational Testing Office of.
Selecting and Assigning Accessibility Features and Accommodated Test Forms in PearsonAccess 1 Accessibility Features and Accommodations.
Improving Practitioner Assessment Participation Decisions for English Language Learners with Disabilities Laurene Christensen, Ph.D. Linda Goldstone, M.S.
1 Developing Tests for Departmental Assessment Deborah Moore, Assessment Specialist Institutional Research, Planning, & Effectiveness University of Kentucky.
The effect of differential item functioning in anchor items on population invariance of equating Anne Corinne Huggins University of Florida.
What’s New with PARCC for ELA? January 30, 2014 Vincent Segalini.
Copyright © 2014 by Educational Testing Service. ETS, the ETS logo, LISTENING. LEARNING. LEADING. and GRE are registered trademarks of Educational Testing.
VB-MAPP Verbal Behavior Milestones Assessment and Placement Program
The Careers Powered By English series English Interview Skills Session 7 of 9 By Lado Management Consultants Adrian O’Donnell.
Holistic Rating Training Requirements Texas Education Agency Student Assessment Division.
Chapter 11: The t Test for Two Related Samples
Multiple Regression and Model Building
Understanding Common Concerns about the Focus School Metric August
4/4/2015Slide 1 SOLVING THE PROBLEM A one-sample t-test of a population mean requires that the variable be quantitative. A one-sample test of a population.
Connecting the Process to: -Current Practice -CEP -CIITS/EDS 1.
What is a CAT?. Introduction COMPUTER ADAPTIVE TEST + performance task.
DIF Analysis Galina Larina of March, 2012 University of Ostrava.
Designing Accessible Reading Assessments Reading Aloud Tests of Reading Review of Research from the Designing Accessible Reading Assessments Projects Cara.
Issues Related to Assessment with Diverse Populations
Confidential and Proprietary. Copyright © 2010 Educational Testing Service. All rights reserved. Catherine Trapani Educational Testing Service ECOLT: October.
National Center on Educational Outcomes NCEO Pre-conference Clinic Under the Big Top! Accommodating Assessments for ALL Students.
© UCLES 2013 Assessing the Fit of IRT Models in Language Testing Muhammad Naveed Khalid Ardeshir Geranpayeh.
Designing Accessible Reading Assessments Research on Making Large Scale Assessments More Accessible for Students with Disabilities Institute of Education.
Copyright © 2006 Educational Testing Service Listening. Learning. Leading. 1 College Admissions Testing: Performance, Validity and Use Cara Cahalan-Laitusis.
1 The New York State Education Department New York State’s Student Reporting and Accountability System.
Psychometric Issues in the Use of Testing Accommodations Chapter 4 David Goh.
1 New York State Growth Model for Educator Evaluation 2011–12 July 2012 PRESENTATION as of 7/9/12.
Cara Cahalan-Laitusis Operational Data or Experimental Design? A Variety of Approaches to Examining the Validity of Test Accommodations.
1 Bias and Sensitivity Review of Items for the MSP/HSPE/EOC August, 2012 ETS Olympia 1.
Rasch trees: A new method for detecting differential item functioning in the Rasch model Carolin Strobl Julia Kopf Achim Zeileis.
Students with Learning Disabilities Assessment. Purposes of Assessment Screening Determining eligibility Planning a program Monitoring student progress.
Evaluating Impacts of MSP Grants Hilary Rhodes, PhD Ellen Bobronnikov February 22, 2010 Common Issues and Recommendations.
The Impact of Missing Data on the Detection of Nonuniform Differential Item Functioning W. Holmes Finch.
Evaluating Impacts of MSP Grants Ellen Bobronnikov Hilary Rhodes January 11, 2010 Common Issues and Recommendations.
School-Wide Rubrics An Overview. Our Expectations NEASC required for accreditation Developed by a 20+ member leadership team with representation of many.
Evaluating Impacts of MSP Grants Ellen Bobronnikov January 6, 2009 Common Issues and Potential Solutions.
Nurhayati, M.Pd Indraprasta University Jakarta.  Validity : Does it measure what it is supposed to measure?  Reliability: How the representative is.
No Child Left Behind Impact on Gwinnett County Public Schools’ Students and Schools.
Grades 3-8 Assessment Results. English Language Arts.
Ensuring Consistency in Assessment of Continuing Care Needs: An Application of Differential Item Functioning Analysis R. Prosser, M. Gelin, D. Papineau,
C R E S S T / CU University of Colorado at Boulder National Center for Research on Evaluation, Standards, and Student Testing Design Principles for Assessment.
Accommodations and Modification in Grades Do NOT fundamentally alter or lower expectations or standards in instructional level, content, or performance.
Chapter 2 The Assessment Process. Two Types of Decisions Legal Decisions The student is determined to have a disability. The disability has an adverse.
The Reliability of Crowdsourcing: Latent Trait Modeling with Mechanical Turk Matt Baucum, Steven V. Rouse, Cindy Miller-Perrin, Elizabeth Mancuso Pepperdine.
SAT and Accountability Evidence and Information Needed and Provided for Using Nationally Recognized High School Assessments for ESSA Kevin Sweeney,
AWG Spoke Committee- English Learner Subgroup
Deputy Commissioner Jeff Wulfson Associate Commissioner Michol Stapel
Presentation transcript:

Copyright © 2006 Educational Testing Service. Listening. Learning. Leading.
Using Differential Item Functioning to Investigate the Impact of Accommodations on the Scores of Students with Disabilities on English-Language Arts Assessments
Mary Pitoniak, Linda Cook, Frederic Cline, and Cara Cahalan-Laitusis, Educational Testing Service
NCME Presentation, April 10, 2006

Purpose and Overview of the Study
The purpose of this study was to examine differential item functioning (DIF) on the English-Language Arts assessment described by Linda Cook. DIF analyses are statistical procedures used to identify items that function differently for different subgroups of examinees. DIF exists when examinees of equal ability differ, on average, according to their group membership in their responses to a particular item (Standards for Educational and Psychological Testing).
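As a compact restatement of this definition (notation added here, not taken from the slides): for a dichotomous item i, DIF is present when the probability of a correct response depends on group membership G even after conditioning on ability θ.

```latex
P(X_i = 1 \mid \theta, G = \text{focal}) \;\neq\; P(X_i = 1 \mid \theta, G = \text{reference})
\quad \text{for some value of } \theta .
```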

Purpose and Overview of the Study (continued)
Issues investigated:
– Do two different DIF detection methods yield the same results?
– Are the results interpretable in terms of a priori or a posteriori evaluation of item content?
– Of particular interest: When the read-aloud modification is used, do the items function differentially for students?

Purpose and Overview of the Study (continued)
Features of the study:
– Two DIF detection methods
– Sample sizes large enough for DIF analysis (not always the case)
– Three matching criteria were considered (total score, Reading score, Writing score); we decided to use the total score for several reasons
– A purification step was used, as recommended by the literature (sketched below)
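To illustrate the purification step: a minimal sketch (my own, not the authors' code), assuming a generic flag_dif routine, hypothetical here, that returns the indices of items flagged for DIF given the responses, a matching score, and group labels. The matching criterion is recomputed from unflagged items until the flagged set stabilizes.

```python
def purify(responses, groups, flag_dif, max_rounds=5):
    """Iterative DIF purification of the matching criterion (illustrative sketch).

    responses: list of lists (examinees x items) of 0/1 item scores
    groups:    group label per examinee (e.g., 'non-LD', 'LD read-aloud')
    flag_dif:  hypothetical function(responses, matching_scores, groups)
               returning the item indices flagged for DIF
    """
    n_items = len(responses[0])
    clean_items = set(range(n_items))   # items currently used in the matching score
    flagged = set()
    for _ in range(max_rounds):
        # Matching criterion computed only from items not flagged so far
        scores = [sum(row[i] for i in clean_items) for row in responses]
        flagged = set(flag_dif(responses, scores, groups))
        new_clean = set(range(n_items)) - flagged
        if new_clean == clean_items:    # flagged set is stable; stop iterating
            break
        clean_items = new_clean
    return flagged, clean_items
```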

Comparisons Made in the Study

DIF Methods Used
– Mantel-Haenszel
– Logistic Regression
For both methods, we used the ETS classification system:
– Category A contains items with negligible DIF;
– Category B contains items with slight to moderate values of DIF;
– Category C contains items with moderate to large values of DIF.
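For concreteness, a minimal sketch of the Mantel-Haenszel calculation and a simplified version of the ETS A/B/C rule. The thresholds of 1.0 and 1.5 on the delta metric reflect the usual ETS guidelines; the operational rule also involves statistical significance tests, which are omitted here.

```python
import math
from collections import defaultdict

def mh_d_dif(item, group, matching_score):
    """Mantel-Haenszel D-DIF for one dichotomous item (illustrative sketch).

    item:           0/1 responses to the studied item
    group:          'ref' or 'focal' label per examinee
    matching_score: total-score matching criterion per examinee
    Assumes every score level contains examinees from both groups.
    Negative values indicate the item favors the reference group.
    """
    tables = defaultdict(lambda: [0, 0, 0, 0])   # per score level: [A, B, C, D]
    for x, g, s in zip(item, group, matching_score):
        t = tables[s]
        if g == 'ref':
            t[0 if x == 1 else 1] += 1           # A = ref correct, B = ref incorrect
        else:
            t[2 if x == 1 else 3] += 1           # C = focal correct, D = focal incorrect

    num = den = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    alpha_mh = num / den                          # common odds ratio across score levels
    return -2.35 * math.log(alpha_mh)             # transform to the ETS delta metric

def ets_category(d_dif):
    """Simplified ETS classification by effect size only (significance tests omitted)."""
    size = abs(d_dif)
    if size < 1.0:
        return 'A'   # negligible DIF
    if size < 1.5:
        return 'B'   # slight to moderate DIF
    return 'C'       # moderate to large DIF
```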

Comparison of Mantel-Haenszel vs. Logistic Regression

Example of Uniform DIF

Example of Non-Uniform DIF
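A minimal sketch of how the logistic-regression method separates these two kinds of DIF, assuming statsmodels and scipy as dependencies (the operational analysis may have used different software and effect-size criteria). Uniform DIF appears as a significant group main effect; non-uniform DIF appears as a significant ability-by-group interaction.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

def lr_dif(item, score, group):
    """Logistic-regression DIF tests for one item (illustrative sketch).

    item:  0/1 responses to the studied item
    score: matching criterion (e.g., total score)
    group: 0 = reference group, 1 = focal group
    Returns p-values for uniform and non-uniform DIF from likelihood-ratio tests.
    """
    item, score, group = map(np.asarray, (item, score, group))

    def fit(*cols):
        X = sm.add_constant(np.column_stack(cols))
        return sm.Logit(item, X).fit(disp=0)

    m1 = fit(score)                        # ability only
    m2 = fit(score, group)                 # + group main effect (uniform DIF)
    m3 = fit(score, group, score * group)  # + interaction (non-uniform DIF)

    uniform_p = chi2.sf(2 * (m2.llf - m1.llf), df=1)      # m2 vs. m1
    nonuniform_p = chi2.sf(2 * (m3.llf - m2.llf), df=1)   # m3 vs. m2
    return uniform_p, nonuniform_p
```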

Results
In this presentation, I will present results only for the Reading items (and not Writing), both for time reasons and because we were most interested in the effects of the accommodations on performance on the Reading items.

Results (continued)
Overall:
– No items were flagged as C
– Each method flagged 9 items as B (out of 42 items × 5 comparisons, or 210 possible flags)
– However, those 9 items were not the same items; in all, 12 different items were flagged by at least one of the methods
– There were inconsistencies between methods

Number of Items Flagged by Each Method

Agreement Between Flags for Methods by Comparison Type

Non-LD vs. LD No Accommodation

Non-LD vs. LD IEP/504 Accommodation

Non-LD vs. LD Read-Aloud Modification

LD Non-Accommodated vs. LD IEP/504 Accommodation

LD Non-Accommodated vs. LD Read-Aloud Modification

Example of Discrepancies in Flags
Item flags: M-H: Uniform DIF; LR: No flag
The items flagged by M-H (but not by LR) as favoring students with the read-aloud modification did show differences such as these when viewed graphically for LR.

Example of Discrepancies in Flags
Item flags: M-H: Uniform DIF; LR: No flag

A Priori Theories About Read-Aloud Modification Results
Five items were easier for students who received the read-aloud modification than for non-LD students. The a priori theories were not that accurate:
– Item A: predicted harder (requires referring back)
– Item B: predicted easier (short item; intonation/body language)
– Item C: predicted easier (intonation/body language)
– Item D: predicted harder (characteristics of the options)
– Item E: predicted harder (length of the options)

A Posteriori Interpretation of Read-Aloud Modification Results
The reasons why these five items were easier with the read-aloud modification were not obvious to the test developers.

What Do the Results Say About the 3 Questions Posed?
Do two different DIF detection methods yield the same results?
– Neither method flagged an item as C.
– There were discrepancies in B flags, however.
– Some discrepancies are explainable in terms of the advantages and disadvantages of the methods, as listed earlier.

3 Questions (continued)
Are the results interpretable in terms of a priori or a posteriori evaluation of item content?
– Not consistently
Of particular interest: When the read-aloud modification is used, do the items function differentially for students?
– Yes, some items were easier when read aloud, which supports this state's decision to view read-aloud as a modification

Next Steps
– ELL and ELL/LD groups to be compared
– Grade 8 ELA to be evaluated
– DIF analysis paradigm to be utilized