Jamal Abedi National Center for Research on Evaluation, Standards, and Student Testing UCLA Graduate School of Education & Information Studies November.


Psychometric Issues in ELL Assessment and Special Education Eligibility
Jamal Abedi
National Center for Research on Evaluation, Standards, and Student Testing
UCLA Graduate School of Education & Information Studies
November 18, 2004
Presented at: English Language Learners Struggling to Learn: Emergent Research on Linguistic Differences and Learning Disabilities

Why Should English Language Learners Be Assessed?
Goals 2000
Titles I and VII of the Improving America's Schools Act of 1994 (IASA)
No Child Left Behind Act

Should Schools Test English Language Learners?  Yes Assessment outcomes may not be valid because their low level English proficiency interferes with content knowledge performance Test results affect decisions regarding promotion or graduation They may be inappropriately placed into special educational programs where they receive inappropriate instruction ELL students may not have received the same curriculum which is assumed for the test General Problems English language learners (ELLs) can be placed at a disadvantage because:

Should Schools Test English Language Learners?  Yes Problems In Large-Scale Assessment: Standardized assessment Assessment tools in large-scale assessments are usually constructed based on norms that exclude ELL populations Research shows major differences between the performance of ELL and non-ELL students on the results of standardized large-scale assessments The tests may be biased in favor of non-ELL populations Performance/alternative assessment Such assessments require more language production; thus students with lower language capabilities are at a greater disadvantage Scorers may not be familiar with rating ELL performance

Should Schools Test English Language Learners?  No
Problems if ELLs are excluded:
Because of the powerful impact of assessment on instruction, the quality of instruction for ELL and SWD students may suffer
If excluded, these students drop out of the accountability picture
Institutions will not be held responsible for their performance in school
They will not be included in state or federal policy decisions
Their academic progress, skills, and needs may not be appropriately assessed

States with the Highest Proportion of ELL Students
Percentage of total student population:
California 27.0
New Mexico 19.0
Arizona 15.4
Alaska 15.0
Texas 14.0
Nevada 11.8
Florida 10.7

Problems in AYP Reporting: Focus on LEP Students
1. Problems in classification/reclassification of LEP students (a moving-target subgroup)
2. Measurement quality
3. Low baseline
4. Instability of the LEP subgroup
5. Sparse LEP population
6. LEP cutoff points (conjunctive vs. compensatory model)
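The conjunctive vs. compensatory distinction in item 6 can be sketched in a few lines of Python (a hedged illustration with an invented cutoff and an invented score profile, not the actual state rules):

```python
def conjunctive(scores: dict[str, float], cutoff: float) -> bool:
    """Pass only if every subscore clears the cutoff."""
    return all(s >= cutoff for s in scores.values())

def compensatory(scores: dict[str, float], cutoff: float) -> bool:
    """Pass if the average clears the cutoff; strengths offset weaknesses."""
    return sum(scores.values()) / len(scores) >= cutoff

# A hypothetical LEP student strong in listening but weak in writing:
profile = {"listening": 80, "speaking": 70, "reading": 62, "writing": 48}

print(conjunctive(profile, 60))    # False: writing falls below the cutoff
print(compensatory(profile, 60))   # True: the average (65.0) clears it
```

The same student is classified differently under the two models, which is one reason LEP cutoff points make the subgroup unstable.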

Site 2: Stanford 9 Subscale Reliabilities (1998), Grade 9 Alphas
Columns: non-LEP students (Hi SES, Low SES, English Only, FEP, RFEP) and LEP
Reading (Vocabulary, 30 items; Reading Comprehension, 54 items; average reliability reported)
N = 205,092 / 35,855 / 181,202 / 37,876 / 21,869 / 52,720
Math, Total (48 items)
N = 207,155 / 36,588 / 183,262 / 38,329 / 22,152 / 54,815
Language (Mechanics, 24 items; Expression, 24 items; average reliability reported)
N = 204,571 / 35,866 / 180,743 / 37,862 / 21,852 / 52,863
Science, Total (40 items)
N = 163,960 / 28,377 / 144,821 / 29,946 / 17,570 / 40,255
Social Science, Total (40 items)
N = 204,965 / 36,132 / 181,078 / 38,052 / 21,967 / 53,925
[The alpha coefficients themselves are missing from the transcript]

Classical Test Theory: Reliability
σ²(X) = σ²(T) + σ²(E)
X: observed score; T: true score; E: error score
ρ(XX′) = σ²(T) / σ²(X) = 1 − σ²(E) / σ²(X)
Textbook examples of possible sources that contribute to the measurement error: rater, occasion, item, test form

Classical Test Theory: Reliability
σ²(X) = σ²(T) + σ²(E)
With an additional systematic source S (e.g., language factors):
σ²(X) = σ²(T) + σ²(E) + σ²(S) + σ(ES)
ρ(XX′) = 1 − (σ²(E) + σ²(S) + σ(ES)) / σ²(X)
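The effect of an added systematic source on reliability can be illustrated with a small simulation (a sketch with invented variance components, assuming S is independent of T and E, so σ(ES) = 0): adding a language-related component S inflates observed variance without adding true-score variance, so the reliability ratio drops.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

t = rng.normal(0, 2.0, n)   # true scores, variance 4
e = rng.normal(0, 1.0, n)   # random error, variance 1
s = rng.normal(0, 1.0, n)   # systematic language-related source S, variance 1

x_standard = t + e          # no language source: reliability near 4/5 = 0.80
x_language = t + e + s      # with S: reliability near 4/6, about 0.67

print(round(t.var() / x_standard.var(), 2))
print(round(t.var() / x_language.var(), 2))
```

With independent components the estimated ratios track σ²(T)/σ²(X) closely; the second group's lower reliability comes entirely from the extra, construct-irrelevant source.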

Generalizability Theory: Partitioning Error Variance into Its Components
σ²(X_pro) = σ²_p + σ²_r + σ²_o + σ²_pr + σ²_po + σ²_ro + σ²_pro,e
p: person; r: rater; o: occasion
Are there any sources of measurement error that may specifically influence ELL performance?
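The partitioning idea can be sketched with a simplified one-facet persons × raters design (invented variance components and sample sizes, not the full p × r × o design on the slide): variance components are recovered from ANOVA mean squares via their expected values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_p, n_r = 2000, 100                    # persons, raters (invented sizes)
var_p, var_r, var_res = 4.0, 1.0, 2.0   # invented true components

p = rng.normal(0, np.sqrt(var_p), (n_p, 1))        # person effects
r = rng.normal(0, np.sqrt(var_r), (1, n_r))        # rater effects
res = rng.normal(0, np.sqrt(var_res), (n_p, n_r))  # interaction + error
x = p + r + res                                    # score matrix X[p, r]

grand = x.mean()
ms_p = n_r * ((x.mean(axis=1) - grand) ** 2).sum() / (n_p - 1)
ms_r = n_p * ((x.mean(axis=0) - grand) ** 2).sum() / (n_r - 1)
ss_res = ((x - x.mean(axis=1, keepdims=True)
             - x.mean(axis=0, keepdims=True) + grand) ** 2).sum()
ms_res = ss_res / ((n_p - 1) * (n_r - 1))

# Expected-mean-square equations for a fully crossed p x r design:
#   E[MS_p] = var_res + n_r * var_p,  E[MS_r] = var_res + n_p * var_r
est_p = (ms_p - ms_res) / n_r
est_r = (ms_r - ms_res) / n_p
est_res = ms_res
```

The estimates land near the generating values; a rater-related component that is large for one subgroup but not another would show up here as subgroup-specific error variance.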

Grade 11 Stanford 9 Reading and Science Structural Modeling Results (df = 24), Site 3
Groups: all cases (N = 7,176), even cases (N = 3,588), odd cases (N = 3,588), non-LEP (N = 6,932), LEP (N = 244)
Reported for each group: goodness of fit (chi-square, NFI, NNFI, CFI), factor loadings for the reading and math composites, and the reading vs. math factor correlation
Note. NFI = Normed Fit Index; NNFI = Non-Normed Fit Index; CFI = Comparative Fit Index.
[The fit statistics, loadings, and correlations are missing from the transcript]

Normal Curve Equivalent Means and Standard Deviations for Students in Grades 10 and 11, Site 3 School District
Columns: Reading (M, SD), Science (M, SD), Math (M, SD)
Rows, for each grade: SWD only; LEP only; LEP & SWD; non-LEP/SWD; all students
[The means and standard deviations are missing from the transcript]

Site 2 Grade 7 SAT 9 Subsection Scores (Reading, Math, Language, Spelling)
LEP: N = 62,273 / 64,153 / 62,559 / 64,359
Non-LEP: N = 244,847 / 245,838 / 243,199 / 246,818
Low SES: N = 92,302 / 94,054 / 92,221 / 94,505
Higher SES: N = 307,931 / 310,684 / 306,176 / 312,321
[The means and standard deviations are missing from the transcript]

Site 4 Grade 8 Descriptive Statistics for SAT 9 Test Scores by Strand (Reading, Math, Math Calculation, Math Analytical)
Groups: non-LEP/non-SWD; LEP only; SWD only; LEP/SWD (M, SD, and N for each)
[The values are missing from the transcript]

Accommodations for SWD/LEP
Accommodations that are appropriate for the particular subgroup should be used.

Why Should English Language Learners Be Accommodated?
Their possible English language deficiency may interfere with their content knowledge performance.
Assessment tools may be culturally and linguistically biased against these students.
Linguistic complexity of the assessment tools may be a source of measurement error.
Language factors may be a source of construct-irrelevant variance.

SY Accommodations Designated for ELLs Cited in States' Policies
There are 73 accommodations listed, coded:
N: not related
R: remotely related
M: moderately related
H: highly related
From: Rivera (2003). State assessment policies for English language learners. Presented at the 2003 Large-Scale Assessment Conference.

SY Accommodations Designated for ELLs Cited in States' Policies
I. Timing/Scheduling (N = 5)
N1. Test time increased
N2. Breaks provided
N3. Test schedule extended
N4. Subtests flexibly scheduled
N5. Test administered at time of day most beneficial to test-taker
(N = not related; R = remotely related; M = moderately related; H = highly related)

There Are 73 Accommodations Listed
47 (64%) are not related
7 (10%) are remotely related
8 (11%) are moderately related
11 (15%) are highly related
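The tallies above are easy to verify with a trivial check, using the counts from the slide:

```python
counts = {"not related": 47, "remotely related": 7,
          "moderately related": 8, "highly related": 11}

total = sum(counts.values())  # the 73 accommodations listed
percents = {k: round(100 * v / total) for k, v in counts.items()}

print(total, percents)
# 73 {'not related': 64, 'remotely related': 10,
#     'moderately related': 11, 'highly related': 15}
```

Roughly two thirds of the accommodations states designate for ELLs are unrelated to ELLs' actual needs.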

A Clear Language of Instruction and Assessment Works for ELLs, SWDs, and Everyone
What is language modification of test items?

Examining Complex Linguistic Features in Content-Based Test Items

Linguistic Modification Concerns
Familiarity/frequency of non-math vocabulary: unfamiliar or infrequent words changed
census → video game
a certain reference file → Mack's company
Length of nominals: long nominals shortened
last year's class vice president → vice president
the pattern of the puppy's weight gain → the pattern above
Question phrases: complex question phrases changed to simple question words
At which of the following times → When
which is the best approximation of the number → approximately how many

Linguistic Modification (cont.)
Conditional clauses: conditionals either replaced with separate sentences, or the order of the conditional and main clause changed
If Lee delivers x newspapers → Lee delivers x newspapers
If two batteries in the sample were found to be dead → he found three broken pencils in the sample
Relative clauses: relative clauses either removed or recast
A report that contains 64 sheets of paper → He needs 64 sheets of paper for each report
Voice of verb phrase: passive verb forms changed to active
The weights of 3 objects were compared → Sandra compared the weights of 3 rabbits
If a marble is taken from the bag → if you take a marble from the bag
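The substitutions above can be mimicked with a rule-based sketch (the rules and replacement phrases here are illustrative toys, not the actual modification procedure used in the research):

```python
import re

# Illustrative rewrite rules modeled on the slide's categories:
# a simple question word for a complex question phrase, active voice
# for a passive frame, and a familiar word for an infrequent one.
RULES = [
    (r"\bAt which of the following times\b", "When"),
    (r"\bwhich is the best approximation of the number\b",
     "approximately how many"),
    (r"\bIf a marble is taken from the bag\b",
     "If you take a marble from the bag"),
    (r"\bcensus\b", "survey"),
]

def simplify(item: str) -> str:
    """Apply each rewrite rule in order to one test item."""
    for pattern, replacement in RULES:
        item = re.sub(pattern, replacement, item)
    return item

print(simplify("At which of the following times did the train arrive?"))
# When did the train arrive?
```

Real linguistic modification is done by trained item writers, of course; the point of the sketch is only that each category of change is a local, content-preserving rewrite.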

Original:
2. The census showed that three hundred fifty-six thousand, ninety-seven people lived in Middletown. Written as a number, that is:
A. 350,697  B. 356,097  C. 356,907  D. 356,970

Modified:
2. Janet played a video game. Her score was three hundred fifty-six thousand, ninety-seven. Written as a number, that is:
A. 350,697  B. 356,097  C. 356,907  D. 356,970

Interview Study
Table 1. Student Perceptions Study: First Set (N = 19): for each item, the number of students who chose the original item vs. the revised item
Table 2. Student Perceptions Study: Second Set (N = 17): same layout
[The item-level counts are missing from the transcript]

Many students indicated that the language in the revised item was easier: “Well, it makes more sense.” “It explains better.” “Because that one’s more confusing.” “It seems simpler. You get a clear idea of what they want you to do.”

Issues in ELL Special Education Eligibility
Issues concerning the authenticity of English language proficiency tests
Issues and problems in identifying students with learning disabilities in general
Distribution of English language proficiency across ELL/non-ELL student categories

Issues Concerning the Authenticity of English Language Proficiency Tests
Issues in theoretical bases (discrete-point, holistic, and pragmatic approaches)
Issues in content coverage (language proficiency standards)
Issues concerning the psychometrics of the assessments
Low relationship between ELL classification categories and English proficiency scores

Issues and Problems in Identifying Students with Learning Disabilities in General
A large majority of students with disabilities fall into the learning disability category
The validity of identifying students with learning disabilities is questionable

Distribution of English Language Proficiency Across ELL/Non-ELL Students
Most of the existing tests of English proficiency lack sufficient discriminating power
A large number of ELL students perform higher than non-ELL students
The line between ELL and non-ELL students' English proficiency is not clear-cut

Reducing the Language Load of Test Items Reducing unnecessary language complexity of test items helps ELL students (and to some extent SWDs) present a more valid picture of their content knowledge. The language clarification of test items may be used as a form of accommodation for English language learners. The results of our research suggest that linguistic complexity of test items may be a significant source of measurement error for ELL students.

Conclusions and Recommendations
1. Classification Issues
Classifications of ELLs and SWDs:
Must be based on multiple criteria that have predictive power for such classifications
These criteria must be objectively defined
Must have sound theoretical and practical bases
Must be easily and objectively measurable

Conclusions and Recommendations
2. Assessment Issues
Assessments for ELLs and SWDs:
Must be based on sound psychometric principles
Must control for all sources of nuisance or confounding variables
Must be free of unnecessary linguistic complexity
Must include a sufficient number of ELLs and SWDs in the development process (field testing, standard setting, etc.)
Must be free of biases, such as cultural bias
Must be sensitive to students' linguistic and cultural needs

Conclusions and Recommendations
3. Special Education Eligibility Issues, particularly placing ELL students with lower English language proficiency in the learning/reading disability category:
There are psychometric issues with the English language proficiency tests
Standardized achievement tests may not provide reliable and valid assessment of ELL students
Reliable and valid measures are needed to distinguish between a learning disability and a low level of English proficiency

Conclusions and Recommendations
4. Accommodation Issues
Accommodations:
Must be relevant to the subgroups of students
Must be effective in reducing the performance gap between accommodated and non-accommodated students
Must be valid; that is, accommodations should not alter the construct being measured
Must yield results that can be combined with those of assessments given under standard conditions
Must be feasible in national and state assessments

Now for a visual art representation of invalid accommodations…