Slide 2: Psychometric Issues in the Assessment of English Language Learners
CRESST / UCLA
Presented at the CRESST 2002 Annual Conference, "Research Goes to School: Assessment, Accountability, and Improvement"
Jamal Abedi, UCLA Graduate School of Education
National Center for Research on Evaluation, Standards, and Student Testing (CRESST)
September 10-11, 2002

Slide 3: Measurement / Psychometric Theory
- Do the same underlying measurement theories used for mainstream assessment apply equally to English language learners? Yes / No
- Do psychometric textbooks have enough coverage of issues concerning the measurement of ELLs? Yes / No
- Are there specific measurement issues that are unique to the assessment of ELLs? Yes / No
- Can the low performance of ELLs in content-based areas be explained mainly by their lack of content knowledge? Yes / No
- Are there any extraneous variables that could specifically impact the performance of ELLs? Yes / No

Slide 4: Psychometric Methods: Development and Application of Modern Mental Measures, by Steven J. Osterlind (University of Missouri), Chapter 3, "Classical Measurement Theory":
Suppose a sample population of examinees is comprised of individuals from two different cultures: in one culture, dogs are considered close family members, while in the other, dogs are considered non-family and meant for work. Now suppose that some of the reading test questions incidentally describe the treatment of dogs. Remember, this is a test of one's reading ability, not a test about dogs.
See also: Abedi, J., & Lord, C. (2001). The language factor in mathematics tests. Applied Measurement in Education, 14(3), 219-234.

Slide 5: Classical Test Theory: Reliability
σ²_X = σ²_T + σ²_E
where X is the observed score, T the true score, and E the error score.
ρ_XX′ = σ²_T / σ²_X = 1 − σ²_E / σ²_X
Textbook examples of possible sources that contribute to measurement error: rater, occasion, item, and test form.
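The reliability identity on this slide can be checked with a few lines of code. The variance values below are hypothetical, chosen only to illustrate the ratio.

```python
# Classical test theory: observed-score variance is the sum of
# true-score and error variance, and reliability is the ratio of
# true-score variance to observed-score variance.
def reliability(var_true: float, var_error: float) -> float:
    """rho_XX' = var_T / var_X = 1 - var_E / var_X."""
    var_observed = var_true + var_error  # var_X = var_T + var_E
    return var_true / var_observed

# Hypothetical example: true-score variance 80, error variance 20.
print(reliability(80.0, 20.0))  # 0.8
```

Either form of the formula gives the same value: 1 − 20/100 is also 0.8.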

Slide 6: Assumptions of Classical True-Score Theory
1. X = T + E
2. E(X) = T (the expected value of the observed score is the true score)
3. ρ_ET = 0 (error scores are uncorrelated with true scores)
4. ρ_E1E2 = 0 (error scores on two measurements are uncorrelated)
5. ρ_E1T2 = 0 (error scores on one measurement are uncorrelated with true scores on another)
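A small numeric sketch can make these assumptions concrete. The scores below are invented, constructed so that the errors average to zero and are exactly uncorrelated with the true scores.

```python
# Toy illustration of the classical true-score assumptions.
def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = (sum((a - mx) ** 2 for a in x) / n) ** 0.5
    sy = (sum((b - my) ** 2 for b in y) / n) ** 0.5
    return cov / (sx * sy)

# Hypothetical true scores and errors, built so the mean error is zero
# (assumption 2) and errors are uncorrelated with true scores
# (assumption 3).
T = [40.0, 45.0, 55.0, 60.0]
E = [2.0, -2.0, -2.0, 2.0]
X = [t + e for t, e in zip(T, E)]  # assumption 1: X = T + E

print(pearson(E, T))  # 0.0
```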

Slide 7: Generalizability Theory: Partitioning Error Variance Into Its Components
σ²(X_pro) = σ²_p + σ²_r + σ²_o + σ²_pr + σ²_po + σ²_ro + σ²_pro,e
where p = person, r = rater, and o = occasion.
Are there any sources of measurement error that may specifically influence ELL performance? There may be other sources, such as test forms, test instructions, item difficulty, and test-taking skills.
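The decomposition can be sketched numerically. All variance-component values below are hypothetical; in practice they would be estimated from a generalizability study (e.g., via ANOVA).

```python
# Hypothetical variance components for a person x rater x occasion
# (p x r x o) design.
components = {
    "p": 40.0,      # persons (the object of measurement)
    "r": 5.0,       # raters
    "o": 3.0,       # occasions
    "pr": 4.0,      # person x rater
    "po": 2.0,      # person x occasion
    "ro": 1.0,      # rater x occasion
    "pro,e": 10.0,  # triple interaction confounded with random error
}

total = sum(components.values())        # sigma^2(X_pro)
person_share = components["p"] / total  # proportion that is "signal"
print(total, round(person_share, 2))    # 65.0 0.62
```

From the measurement point of view, every component other than the person component is a potential source of error.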

Slide 8: Validity of Academic Achievement Measures
We will focus on content and construct validity approaches:
- "A test's content validity involves the careful definition of the domain of behaviors to be measured by a test and the logical design of items to cover all the important areas of this domain" (Allen & Yen, 1979, p. 96).
- "A test's construct validity is the degree to which it measures the theoretical construct or trait that it was designed to measure" (Allen & Yen, 1979, p. 108).
Examples:
- A content-based achievement test has construct validity if it measures the construct it is supposed to measure.
- A content-based achievement test has content validity if the test content is representative of the content domain being measured.

Slide 9: Study #2, an interview study (Abedi, Lord, & Plummer, 1997)
- 37 students were asked to express their preference between original NAEP items and linguistically modified versions of the same items.
- The math test items were modified to reduce their level of linguistic complexity.
Finding: over 80% of the students interviewed preferred the linguistically modified items over the original versions.

Slide 10: Study #3, the impact of linguistic factors on students' performance (Abedi, Lord, & Plummer, 1997)
Two studies: test performance and speed.
Sample: 1,031 grade 8 ELL and non-ELL students in 41 classes from 21 southern California schools.
Finding: ELL students who received a linguistically modified version of the math test items performed significantly better than those receiving the original test items.

Slide 11: Study #4, the impact of different types of accommodations on students with limited English proficiency (Abedi, Lord, & Hofstetter, 1997)
Sample: 1,394 grade 8 students in 56 classes from 27 southern California schools.
Findings:
- Spanish translation of the NAEP math test: Spanish speakers taking the Spanish translation performed significantly lower than Spanish speakers taking the English version. We believe this is due to the impact of the language of instruction on assessment.
- Linguistic modification: contributed to improved performance on 49% of the items.
- Extra time: helped grade 8 ELL students on NAEP math tests, but it also aided non-ELL students, so it has limited potential as an assessment accommodation.

Slide 12: Study #5, the impact of selected background variables on students' NAEP math performance (Abedi, Hofstetter, & Lord, 1998)
Sample: 946 grade 8 ELL and non-ELL students in 38 classes from 19 southern California schools.
Findings:
- Four different accommodations were used: linguistically modified items, a glossary only, extra time only, and a glossary plus extra time.
- The glossary plus extra time was the most effective accommodation.
- However, under the glossary plus extra time accommodation, non-ELLs showed a greater improvement (16%) than ELLs (13%). This is the opposite of what is expected and casts doubt on the validity of this accommodation.

Slide 13: Study #8, language accommodation for large-scale assessment in science (Abedi, Courtney, & Leon, 2001)
Sample: 1,856 grade 4 and 1,512 grade 8 ELL and non-ELL students in 132 classes from 40 school sites in four cities across three states.
Findings:
- Linguistic modification of the test items improved the performance of ELLs in grade 8.
- The modified test did not change the performance of non-ELLs.
- The validity of the assessment was not compromised by the provision of the accommodation.

Slide 14: Studies #9 and #10
Study #9: the impact of students' language background on content-based performance, analyses of extant data (Abedi & Leon, 1999). Analyses were performed on extant data, such as the Stanford 9 and the ITBS. Sample: over 900,000 students from four different sites nationwide.
Study #10: examining ELL and non-ELL student performance differences and their relationship to background factors (Abedi, Leon, & Mirocha, 2001). Data were analyzed for the impact of language on the assessment and accommodation of ELL students. Sample: over 700,000 students from four different sites nationwide.
Findings:
- The higher the language demand of the test items, the larger the performance gap between ELL and non-ELL students.
- There was a large performance gap between ELL and non-ELL students in reading, science, and math problem solving (about 15 NCE score points).
- This performance gap was reduced to zero in math computation.

Slide 15: Normal Curve Equivalent Means and Standard Deviations for Students in Grades 10 and 11, Site 3 School District

                    Reading        Science        Math
                    M     SD       M     SD       M     SD
Grade 10
  SD only          16.4  12.7     25.5  13.3     22.5  11.7
  LEP only         24.0  16.4     32.9  15.3     36.8  16.0
  LEP & SD         16.3  11.2     24.8   9.3     23.6   9.8
  Non-LEP & SD     38.0  16.0     42.6  17.2     39.6  16.9
  All students     36.0  16.9     41.3  17.5     38.5  17.0
Grade 11
  SD only          14.9  13.2     21.5  12.3     24.3  13.2
  LEP only         22.5  16.1     28.4  14.4     45.5  18.2
  LEP & SD         15.5  12.7     26.1  20.1     25.1  13.0
  Non-LEP & SD     38.4  18.3     39.6  18.8     45.2  21.1
  All students     36.2  19.0     38.2  18.9     44.0  21.2

Note. LEP = limited English proficient; SD = students with disabilities.
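For readers unfamiliar with the metric in this table: Normal Curve Equivalent (NCE) scores are normalized standard scores with a mean of 50 and a standard deviation of 21.06, so the roughly 15-point gaps reported earlier are about 0.7 of a national standard deviation. A z-score converts as follows:

```python
# NCE scores have mean 50 and SD 21.06 on the national norm.
def z_to_nce(z: float) -> float:
    return 50.0 + 21.06 * z

# A student one standard deviation below the national mean:
print(round(z_to_nce(-1.0), 2))  # 28.94
```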

Slide 16: Disparity Index
The Disparity Index (DI) is an index of performance differences between LEP and non-LEP students.

Site 3 Disparity Index (DI), non-LEP/non-SD students compared to LEP-only students:

Grade   Reading   Math Total   Math Calculation   Math Analytical
3         53.4        25.8            12.9              32.8
6         81.6        37.6            22.2              46.1
8        125.2        36.9            25.2              44.0
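The slide does not give the DI formula. One plausible formulation, consistent with the magnitudes in the table but an assumption on my part rather than something the slide states, is the percentage by which the non-LEP group mean exceeds the LEP group mean:

```python
# Assumed definition: DI as the percentage gap between group means.
def disparity_index(mean_non_lep: float, mean_lep: float) -> float:
    return (mean_non_lep - mean_lep) / mean_lep * 100.0

# Hypothetical NCE means: non-LEP 46.0 versus LEP 30.0.
print(round(disparity_index(46.0, 30.0), 1))  # 53.3
```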

Slides 17-18: [figures not captured in the transcript]

Slide 19: Generalizability Theory: Language as an Additional Source of Measurement Error
σ²(X_prl) = σ²_p + σ²_r + σ²_l + σ²_pr + σ²_pl + σ²_rl + σ²_prl,e
where p = person, r = rater, and l = language.
Are there any sources of measurement error that may specifically influence ELL performance?
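With language as a facet, a generalizability coefficient treats the person-by-language interaction as part of the error that degrades score interpretation. The component values below are hypothetical, and the error term follows the usual convention that relative error comprises the person-involving interactions plus the residual.

```python
# Hypothetical variance components for a person x rater x language
# (p x r x l) design, single-rater administration.
components = {
    "p": 40.0, "r": 5.0, "l": 6.0,
    "pr": 4.0, "pl": 8.0, "rl": 1.0, "prl,e": 10.0,
}

# Relative-error variance: person-involving interactions plus residual.
rel_error = components["pr"] + components["pl"] + components["prl,e"]
g_coef = components["p"] / (components["p"] + rel_error)
print(round(g_coef, 2))  # 0.65
```

If an accommodation reduced the person-by-language component from 8.0 to 0.0, the same calculation would rise to 40/54, about 0.74, which is one way to quantify what a valid accommodation buys.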

Slide 20: Language Variance Components
Main effect of the language factor:
- σ²_l: different levels of English/native-language proficiency
Interactions of the language factor with other factors:
- σ²_pl: different levels of English/native-language proficiency across persons
- σ²_rl: differential treatment of ELL students by raters with different backgrounds
- σ²_prl,e: a combination of different levels of language proficiency, the interaction of rater with language and persons, and unspecified sources of measurement error

Slide 21: Issues and Problems in the Classification of Students With Limited English Proficiency

Slide 22: Correlation Between the LAS Rating and LEP Classification for Site 4

Grade              G2    G3    G4    G5     G6    G7    G8    G9     G10   G11   G12
Pearson r         .223  .195  .187  .199   .224  .261  .252  .265   .304  .272  .176
Sig. (2-tailed)   .000  .000  .000  .000   .000  .000  .000  .000   .000  .000  .000
N                  587   721   621  1,002   803   938   796  1,102   945   782   836

Findings: Since LEP classification is based on students' level of language proficiency, and because the LAS is a measure of language proficiency, one would expect to find a very strong correlation between LAS scores and LEP status (LEP versus non-LEP). Instead, the analyses indicated only a weak relationship between language proficiency test scores and language classification codes (LEP categories).
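The pattern in this table, a language measure correlating only weakly with a classification nominally based on it, can be illustrated with a point-biserial correlation on invented data:

```python
def pearson(x, y):
    """Pearson correlation; with one binary variable this is the
    point-biserial correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = (sum((a - mx) ** 2 for a in x) / n) ** 0.5
    sy = (sum((b - my) ** 2 for b in y) / n) ** 0.5
    return cov / (sx * sy)

# Hypothetical proficiency scores and LEP codes (1 = LEP, 0 = non-LEP).
# The codes are deliberately inconsistent with the scores, mimicking a
# classification that does not track measured proficiency.
scores = [55, 60, 72, 80, 85, 90, 62, 95]
lep = [1, 0, 1, 0, 1, 0, 0, 0]
print(round(pearson(scores, lep), 2))  # a weak correlation, -0.23
```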

Slide 23: Correlation Coefficients Between LEP Classification Code and ITBS Subscales for Site 1

                    Reading   Math Concept   Math Problem   Math
                              & Estimation   Solving        Computation
Grade 3
  Pearson r          -.160       -.045          -.076          .028
  Sig. (2-tailed)     .000        .000           .000          .000
  N                 36,006      35,981         35,948        36,000
Grade 6
  Pearson r          -.256       -.154          -.180         -.081
  Sig. (2-tailed)     .000        .000           .000          .000
  N                 28,272      28,273         28,250        28,261
Grade 8
  Pearson r          -.257       -.168          -.206         -.099
  Sig. (2-tailed)     .000        .000           .000          .000
  N                 25,362      25,336         25,333        25,342

