Students in the gap(s): Research findings on who they are, what they need, and implications for the 2% flexibility option Gaye Fedorchak New Hampshire.

1 Students in the gap(s): Research findings on who they are, what they need, and implications for the 2% flexibility option Gaye Fedorchak New Hampshire Department of Education Sue Bechard Measured Progress

2 Using Data to Improve Instruction Understand differences between instruction and assessment contexts Investigate component skills underlying grade level expectations Discern students’ assessment needs Make decisions about the 2% flexibility option

3 New England Compact Enhanced Assessment Grant • Funded by US Department of Education • Four states: NH, ME, RI, VT • 2005-2007 • Challenge: describe students in the gap, and design an assessment that will meet the needs of students in the gap • 8th grade mathematics –8th grade to look at complexity –Mathematics to avoid reading comprehension issues

4 Project history • Original goals –Identify students in the gaps –Develop varied assessment modules –Pilot/validate assessment modules • Issues we faced –Not easy to identify the gap, or the students –Impossible to develop an assessment without knowing target students’ needs

5 Revised goals Identify students in the gaps through multiple methods, triangulating evidence Define common criteria for identifying students in the gap Plan and develop task module assessment strategies (assessment prototypes) Recommend core components of an assessment structure that would lessen the gaps Disseminate products to others considering assessments for students in the gap

6 Accountability context Project began February 2005 Modified achievement standards announced April 2005 Proposed “2%” regulations released December 2005 Studies were designed before 2%, not in response to 2% Findings speak to needs of all students not effectively assessed in current system, not necessarily dovetailing with 2% definitions

7 Five Studies and a Literature Review Who are the students in the gaps? (Parker & Saxon) Of all the students who are not proficient, how can states identify those who are in the assessment gaps? (Bechard & Godin) What are the attributes of students in the gaps, and how do these students perform? (Bechard & Godin) What issues in the assessments themselves contribute to the gaps? (Dolan) Are there specific aspects of multiple-choice items used in state assessments that contribute to the assessment gaps? (Famularo & Russell)

8 Gap identification process Conduct exploratory interviews with teachers to identify the assessment gaps Review student assessment data Review teacher judgment data Operationalize gap criteria Conduct focused teacher interviews to confirm gap criteria Parker and Saxon: Teacher views of students and assessments Bechard and Godin: Finding the real assessment gaps

9 The process for investigating gap profiles Bechard and Godin: Who are students in gaps? Conduct focused teacher interviews to confirm gap criteria Investigate characteristics of students in gap 1 Investigate characteristics of students in gap 2 Investigate achievement patterns of students in gap 1 Investigate achievement patterns of students in gap 2 Develop profiles of students in gap 1 Develop profiles of students in gap 2

10 Alternative test items Hypothesize alternate test items Decompose items into requisite skills/knowledge Provide alternative formats: Item format Item content Visuals Multimedia Review with mathematics experts Pilot and evaluate items Russell and Famularo: Utility of a prototype assessment Dolan et al.: Providing students with choice

11 Literature Review Students likely to be in the gaps: –Mild mental retardation –Learning disabilities –English language learners Target population: middle school Target academic content: mathematics

12 Middle School Math Instruction and Assessment for Students with LD, Students with MMR, & ELL Students: A Review of the Literature Bob Dolan, Boo Murray, and Nicole Strangman

13 Purpose • Comprehensive literature review of research-based practices during instruction and assessment of students with learning disabilities (LD), students with mild mental retardation (MMR), and students who are English language learners (ELL). Instructional techniques include instructional approaches as well as scaffolds and supports used in the classroom. Assessment techniques consider test design and delivery, with emphasis on testing accommodations. • Focus on identifying common approaches, despite large heterogeneity within each group. • Goal to support states in understanding how these students may be represented within a definition of “students in the gap.” • Only considering students who would take the general assessment (i.e., not considering students who would qualify for AA-AAS).

14 Included Student Populations • Mild Mental Retardation (MMR) Students identified as having MMR, being “educable mentally retarded,” or described as having mental retardation and an IQ within the range of 50-55 to approximately 70 (DSM-IV). • Learning Disabilities (LD) Students identified as having LD, a specific LD, or a reading disability as determined by the study author(s). Students not diagnosed with a specific LD but having had low performance in math calculations that would presumably meet the IDEA definition for LD. No attempt was made to evaluate the methodology used to diagnose LD (e.g., discrepancy model, response to intervention model); it was assumed that authors followed the standard IDEA definition of LD. • English Language Learners (ELL) ELL students, English as a second language (ESL) students, and limited English proficient (LEP) students.

15 Conclusions • Large disconnect between the level and types of instructional supports and testing accommodations for all three student populations. Instructional supports focus largely on pedagogical approaches toward reducing barriers to learning. Test designs, modes of administration, and accommodations are largely limited to reducing accessibility barriers. • Discontinuity reflects the limited nature of current large-scale assessment techniques and psychometric approaches toward their design. Concern over compromising the validity of test inferences or comparability of scores of students who do and don’t receive such supports. • General failure to consider the heterogeneity of the student population that could significantly impact the effectiveness of test design factors, modes of administration, and accommodations. • As a result, techniques that may be largely successful in allowing these students to learn effectively may not be available at the point that students must demonstrate learning.

16 Considerations for Future Research • Approaches toward assessment that better dovetail with the supports students receive instructionally. • Additional research focused on methodologies for test development that consider construct-relevant vs. construct-irrelevant factors, such as Evidence Centered Design (Mislevy et al.) and construct deconstruction. • Approaches that consider individual student differences, such as through application of universal design (Mace et al., 1996) and Universal Design for Learning (Rose & Meyer, 2002) principles, to create and administer tests that consider diverse students from the start (Thompson et al., 2002) and flexible tests that include built-in supports for diverse students (Bryant & Rivera, 1997; Dolan & Hall, 2001; Dolan, Hall, Banerjee, Chun, & Strangman, 2005; Ketterlin-Geller, 2005).

17 Are there specific aspects of multiple-choice items used in state assessments that contribute to the assessment gaps? Famularo and Russell

18 Examining the Utility of a Prototype Assessment for Assessing Students in the Gap Lisa Famularo and Michael Russell, Technology and Assessment Study Collaborative (inTASC)

19 Overview Goal: Develop and pilot-test a prototype for assessing students in the gaps. 4 complex algebra problems were the foundation for the prototype: –Linear Patterns –Equality –Rate of Change –Evaluating an Expression Modifications were made to determine what, if any, changes would enable students to solve them –Changing problem context from words to pictures –Removing the context of the problem –Changing how information was presented –Simplifying the problem

20 Purpose of the study Assess the quality and usefulness of items designed to decompose skills/knowledge required to solve complex problems. Examine the extent to which students who perform well on the complex items also perform well on the decomposed items. Examine the extent to which students in the gap are able to succeed on decomposed items while struggling with the complex items.

21 Gap Definitions Gap 1: The validity gap contains low-scoring students whose teachers rated their performance in class as proficient. In other words, there is a discrepancy between their performance on the test and their teachers’ rating of their proficiency. Gap 2: The relevance gap consists of students who scored in the lowest achievement level on the test and were rated as low performing in class by their teachers. The large-scale assessment, which is aligned to grade-level achievement standards, is not sensitive to their progress, even with appropriate accommodations and effective instruction.
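The two gap definitions above can be read as a simple classification rule. The sketch below is only an illustration under stated assumptions: it assumes four achievement levels numbered 1 to 4 with level 3 as "proficient", and it treats "rated as low performing" as a teacher rating in the lowest level. The slides do not give these cut-offs, and the names `Student` and `classify_gap` are hypothetical.

```python
from dataclasses import dataclass

# Assumed coding (not from the slides): levels run 1 (lowest) to 4,
# and level 3 is the "proficient" threshold.
PROFICIENT = 3
LOWEST = 1


@dataclass
class Student:
    test_level: int     # achievement level on the state test, 1-4
    teacher_level: int  # teacher's rating of classroom work, 1-4


def classify_gap(s: Student) -> str:
    """Return 'gap 1', 'gap 2', or 'neither' under the assumed cut-offs."""
    # Validity gap: below proficient on the test, proficient in class.
    if s.test_level < PROFICIENT and s.teacher_level >= PROFICIENT:
        return "gap 1"
    # Relevance gap: lowest level on the test and lowest rating in class.
    if s.test_level == LOWEST and s.teacher_level == LOWEST:
        return "gap 2"
    return "neither"


print(classify_gap(Student(test_level=2, teacher_level=3)))
print(classify_gap(Student(test_level=1, teacher_level=1)))
```

A student who is low on the test and rated low, but not lowest, in class falls into "neither" here, which roughly corresponds to what later slides call the non-gap groups.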

22 Prototype test One test containing 43 MC items Four sets of algebra items: “item families” One item family each from these stems within the algebra strand: –Linear Patterns –Evaluating an Equation –Equality –Rate of Change Each item family contained: –One “Parent” item taken from the NECAP grade 8 math test –One isomorph “Sibling” representation of the parent –10-11 deconstructed component items: “Child Items” Data were collected (Spring 2006) from 2,365 8th grade students from schools in NH, RI, and VT.

23 Process used to develop item families Criteria used to select the “Parent” items: Complex problems that required multiple skills/concepts. Of moderate difficulty (item difficulty range: .56 to .66). Criteria for developing “Child” items: Developed to probe students’ understanding of the component skills both individually and in combination. Alternate representations of parent items were developed to determine if modifications to the original problem would enable students to solve it.
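The "item difficulty" range quoted for parent items is most likely the classical proportion-correct index (the share of students who answer an item correctly), although the slide does not say so explicitly. A minimal sketch of that reading follows; the response data are made up for illustration and the function names are hypothetical.

```python
def item_difficulty(responses):
    """Proportion of students answering correctly (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)


def is_moderate(p, low=0.56, high=0.66):
    """Check whether a difficulty value falls in the range quoted on the slide."""
    return low <= p <= high


# Made-up responses from ten students to one item.
scores = [1, 1, 0, 1, 0, 1, 0, 1, 1, 0]
p = item_difficulty(scores)
print(p, is_moderate(p))
```

Under this index, higher values mean easier items, so a child item that is "easier than its parent" would show a larger proportion correct.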

24 Example Illustrates the impact of: –Simplifying the values –Removing the context of the problem

25 Simplifying the Values

26

27 Gap 1: 53% Gap 2: 51% Gap 1: 46% Gap 2: 38%

28 Simplifying the Values & Removing the Context

29

30 Gap 1: 77% Gap 2: 75% Gap 1: 53% Gap 2: 51%

31 Findings A factor analysis revealed that the items do cluster together by family and reliability analysis showed moderate to high internal consistency with reliability coefficients
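The slide does not name the reliability coefficient used. Assuming it was Cronbach's alpha, a common measure of internal consistency, the calculation for one item family could be sketched as below. The function name and the toy data are hypothetical and are not the study's results.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    scores: one row per student, one 0/1 column per item in the family.
    """
    k = len(scores[0])
    # Variance of each item across students.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of each student's total score.
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data: three items that always agree, so alpha is at its maximum.
toy = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
print(cronbach_alpha(toy))
```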

32 Findings As expected, most of the child items were easier than their parent. We expected that within each family the parent and sibling items would have roughly the same item difficulty but our analyses revealed that the siblings were easier. (Practice effect?)

33 Findings Students who answered a parent item correctly usually answered child items in that family correctly Students who answered a parent item incorrectly were less consistent in their performance on the child items – sometimes correct, sometimes incorrect Below grade level items reduced performance differences between gap and comparison groups more than grade level items. However, some grade level items in the rate of change and linear patterns families did not appear to reduce the gap in performance as much as the on grade level items.
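One way to examine the first two findings above is to compare the average share of child items answered correctly between students who got the parent item right and those who got it wrong. This is a hedged sketch with made-up records, not the analysis the authors ran; `child_rate_by_parent` is a hypothetical helper.

```python
from statistics import mean


def child_rate_by_parent(records):
    """Average proportion of child items correct, split by parent outcome.

    records: list of (parent_correct: bool, child_scores: list of 0/1).
    """
    groups = {True: [], False: []}
    for parent_correct, child_scores in records:
        groups[parent_correct].append(mean(child_scores))
    # Skip a group if no student fell into it.
    return {k: mean(v) for k, v in groups.items() if v}


# Made-up records for four students in one item family.
records = [
    (True, [1, 1, 1, 1]),
    (True, [1, 1, 1, 0]),
    (False, [1, 0, 0, 1]),
    (False, [0, 0, 1, 0]),
]
print(child_rate_by_parent(records))
```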

34 Findings In the equality family, the two items that reduce the gap in performance the most differ from the parent in that they are single-step rather than multi-step problems. Removing the problem situation does not reduce the gap in performance but simplifying the problem (by using whole numbers or variables equal to 1) and having the student demonstrate understanding of algebraic expressions without requiring them to solve an equation might reduce the gap. Simplifying the problems presented appeared to enable some students in the gap to solve them correctly. In many cases, however, simplifying the problem resulted in items that were considered below grade level.

35 Summary Findings suggest that the dual goal of: Measuring student achievement relative to grade level expectations and Providing teachers with information about what students can and cannot do –might be accomplished through a modular test design that employed “parent” items to measure student achievement relative to grade level achievements and “child” items to measure component skills required to accurately answer complex parent items.

36 Summary: Item Changes found to have a positive effect on gap student performance: (Impact was typically greater for Gap 1 students than for Gap 2.) –Simplify by using whole numbers –Using whole numbers & removing the context –Simplifying information in the table (for Gap 1) –Identifying the correct algebraic expression as opposed to solving an equation

37 Summary Item Changes found to be Not Effective for Gap students: –Changing the table format from vertical to horizontal –Removing the context & changing the numerical sequence from decreasing to increasing Item Changes that need more study: –Removing the problem context – sometimes reduces difficulty, sometimes not – how does this happen? –Changing problem context from abstract symbols to pictures (instead of removing problem context).

38 Recommendations for future research Item design research is needed: Need well described, clearer categories of component (child) items that produce lower difficulty for gap 1 and gap 2 kids: Some questions to answer - –What does it mean to “remove” context from a math problem? It did not seem to reduce difficulty for gap kids reliably and when it did, the problem was also below grade level. Does removing context (as described here) cause linguistic complexity to decrease, or does it cause linguistic complexity to increase? What features control this impact? –If not removed, how can presentation context be changed? Can changing from words to pictures help reduce problem difficulty (or linguistic complexity) without lowering grade level? –What does it mean to “simplify” a problem? Does this primarily impact the memory load students are handling while solving a problem? If so, how else might we reduce memory load during testing without altering test construct or reducing grade level? Policy Question: –If a student could solve each component part of a problem, but not all parts of the problem at once – should it count as grade level?

39 Who are the students in the gaps? Parker and Saxon

40 Gaps in Large-Scale Assessment: Teacher Views Caroline E. Parker, EDC Susan Saxon, Ed Alliance at Brown

41 Background Single exploratory study Findings in two areas: –Teacher views of students –Teacher views of assessments Two separate exploratory papers –Student characteristics triangulated in other studies –Teacher views of assessments still exploratory, not triangulated

42 Methods 40 teachers/administrators from ME, NH, RI, VT –23 mathematics teachers –14 special education teachers –3 administrators with special education and mathematics expertise Semi-structured interviews Convenience sample Total of 9 schools and 1 district (teachers from 3 schools) Analysis: iterative coding

43 Explaining the gap between classroom achievement and assessment results

44 Conclusion Two assessment gaps –First gap includes students who appear to be proficient in class, not proficient on assessment –Second gap includes students far below grade level in class, very low scores on assessment –Both gaps include students with disabilities, ELLs and general education students, but in different percentages. Three assessment characteristics: –Structure –Relationship/Scaffolding –Relevance Study could not delineate between assessment gap and instruction gap and the effects of teacher expectations, content coverage, and opportunities to learn.

45 Of all the students who are not proficient, how can states identify those who are in the assessment gaps? What are the attributes of students in the gaps, and how do these students perform? Bechard and Godin

46 Identifying the gaps in state assessment systems Sue Bechard and Ken Godin Measured Progress

47 Data sources State assessment data – grade 8 mathematics results from two systems General large-scale test results Demographics (special programs, ethnicity, gender) Teachers’ judgments of students’ classroom work Student questionnaires completed at time of test Accommodations used at time of test State data bases for additional student demographic data Disability classification Free/reduced lunch Attendance Student-focused teacher interviews

48 Why use teacher judgment of students’ classroom performance? Validity Gap: the test may not reflect classroom performance Teachers see students performing proficiently in class, but test results are below proficient. Relevance Gap: the test may not be relevant for instructional planning Teachers rate students’ class work as low as possible and test results are at “chance” level. No information is generated on what students can do.

49 Teacher judgment instructions The instructions were clear that this was to be a judgment of the student’s demonstrated achievement on GLE-aligned academic material in the classroom, not a prediction of test performance. The teacher judgment field consisted of 12 possibilities – each of the 4 achievement levels had low, medium, and high divisions.
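The 12-point teacher judgment field (4 achievement levels, each with low, medium, and high divisions) can be sketched as a simple encoding. The numbering below is an assumption made for illustration; the slides do not say how the 12 options were coded. Collapsing back to the 4 levels is shown as one plausible meaning of the "gross grained" ratings mentioned in the results.

```python
DIVISIONS = ["low", "medium", "high"]
NUM_LEVELS = 4  # four achievement levels, per the slide


def fine_rating(level, division):
    """Map (level 1-4, division) to an assumed 1-12 fine-grained rating."""
    if not 1 <= level <= NUM_LEVELS:
        raise ValueError("level must be between 1 and 4")
    return (level - 1) * len(DIVISIONS) + DIVISIONS.index(division) + 1


def gross_rating(fine):
    """Collapse a 1-12 fine rating back to its 1-4 achievement level."""
    return (fine - 1) // len(DIVISIONS) + 1


print(fine_rating(1, "low"), fine_rating(4, "high"))
print(gross_rating(fine_rating(3, "low")))
```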

50 Research on validity of teacher judgment While there are some conflicting results, the most accurate judgments were found when: teachers were given specific evaluation criteria; levels of competency were clearly delineated; criterion-referenced tests in mathematics or reading were the matching measure; criterion-referenced tests reflected the same content as did classroom assessments; judgments were of older students who had no exceptional characteristics; and teachers were asked to assign ratings to students, not to rank-order them.

51 Validation of teacher judgment data Data were collected to establish the “Round 1” cutpoints (of 3 rounds) during standard setting. Validation studies were conducted which asked: Were there differences between the sample of students with non-missing teacher judgment data and the rest of the population? Were there suspicious trends in the judgment data suggesting that teachers did not take the task seriously? How did teacher judgments compare with students’ actual test scores? Results of these investigations were considered supportive of using the teacher judgment data for standard setting.

52 Teacher judgment vs. test performance

53 Operationalizing the gap definitions using teacher judgment

54

55 Student questionnaires (answered after taking the test) 1. How difficult was the mathematics test? A. harder than my regular mathematics schoolwork B. about the same as my regular mathematics schoolwork C. easier than my regular mathematics schoolwork 2. How hard did you try on the mathematics test? A. I tried harder on this test than I do on my regular mathematics schoolwork. B. I tried about the same as I do on my regular mathematics schoolwork. C. I did not try as hard on this test as I do on my regular mathematics schoolwork

56 Accommodations (used during the mathematics test) 16 accommodations listed by category: Setting Scheduling/timing Presentation formats Response formats

57 Student-focused teacher interviews Student profile data math test scores (both overall and on subtests) specific responses to released math test items student’s responses to the questionnaire special program status accommodations used during testing Teacher interview questions Questions regarding perceptions of the students in each gap on various aspects of gap criteria, 17 Likert scale questions on the student’s class work and participation in classroom activities.

58 Student-focused teacher interview samples 20 8th grade math and special ed teachers 7 schools across three states (NH, RI, and VT). 51 students: gap 1=19, gap 2=18, and comparison group=14.

59 Results: Percentages of students in the gaps Relevance gap 2 and non-gap 2 percentages are different when fine or gross grained ratings are used.

60 Accommodations use Students in validity gap 1 were significantly less likely to use accommodations than students in non-gap 1. Only a small percentage of students in validity gap 1 used any accommodations at all. The majority of students in both relevance gap 2 and non-gap 2 used one or more accommodations.

61 Performance of students in validity gap 1 compared to non-gap 1 + Statistically higher than expected - Statistically lower than expected

62 Special program status of students in validity gap 1 The majority of students in validity gap 1 were in general education. Students with IEPs were under-represented in validity gap 1 and over-represented in non-gap 1. + Statistically higher than expected - Statistically lower than expected

63 Disability designations in validity gap Learning disabilities Validity Gap 1: 57.7% of the IEP gap 1 group (n=208) Non-gap 1: 49.7% of the IEP non-gap 1 group (n=860) Comparison: 49.2% of the IEP comparison group (n=83) Total population: 52% of students with IEPs (N=4,465) Disability designations only seen in non-gap 1: Students with learning impairments (MR), deafness, multiple disabilities and traumatic brain injury

64 Additional characteristics of students in validity gap 1 compared to non-gap 1 Validity gap students: Were more likely female and white Had the fewest absences Had higher SES Found the state test about the same level of difficulty as class work Exhibited academic and mathematics-appropriate behaviors in class

65 Performance of students in relevance gap 2 on the test By definition, students in both relevance gap 2 and non-gap 2 scored no better than chance on the assessment.

66 Special program status of students in relevance gap 2 The vast majority of students in relevance gap 2 and non-gap 2 were students with IEPs.

67 Disability designations in relevance gap Learning disabilities: Fewer than half of the students in relevance gap 2 groups had learning disabilities Students who were deaf/blind and those with multiple disabilities were only found in gap 2. Students with hearing impairments, deafness and traumatic brain injury were only found in non-gap 2.

68 Additional characteristics of students in relevance gap 2 compared to non-gap 2 Students in relevance gap 2 were very similar to students in non-gap 2 on most variables. Students from both groups felt that the test was as hard as or harder than their schoolwork. They tried as hard as or harder on the test as in class. They used mathematics tools in the classroom (e.g., calculators).

69 Summary: How many students are in the gaps? 10.9% - 11.4% of the total student population in two systems are in assessment gaps. NECAP Validity Gap 1 = 8.6% Relevance Gap 2 = 2.3% MEA Validity Gap 1 = 7.1% Relevance Gap 2 = 4.3%
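The 10.9% to 11.4% range on this slide is the sum of the two gap percentages within each assessment system. The short check below uses only the figures on the slide; the dictionary layout is just a convenient illustration.

```python
# Percentages of the total student population, as reported on the slide.
gaps = {
    "NECAP": {"validity gap 1": 8.6, "relevance gap 2": 2.3},
    "MEA": {"validity gap 1": 7.1, "relevance gap 2": 4.3},
}

# Sum the two gaps per system, rounding to one decimal to absorb float error.
totals = {system: round(sum(parts.values()), 1) for system, parts in gaps.items()}
print(totals)
```

The lower bound of the range comes from NECAP and the upper bound from MEA. The split between the two gaps also differs, with relevance gap 2 making up a larger share in MEA than in NECAP.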

70 Summary We found substantial differences between the composition of the validity gap 1 groups, which held in both NECAP and MEA systems. Validity gap 1 students may have characteristics and behaviors that mask their difficulties. Non-gap 1 students are those generally thought to be in the “achievement gap”.

71 Summary (cont.) Low performing students in relevance gap 2 and non-gap 2 share many characteristics. Their extremely low performances in both classroom activities and the test raise issues about the relevancy of the general assessment for them.

72 Conclusions For students in validity gap 1, increase focus on classroom supports and training on how to transfer their knowledge and skills from classroom to assessment environments. For students in non-gap 1, examine expectations and opportunities to learn. Providing a different test based on modified academic achievement standards is premature. Students with IEPs in relevance gap 2 and non-gap 2 may benefit from the 2% option for AYP and an alternate assessment based on modified academic achievement standards (AA-MAAS). There will be challenges designing a test based on MAAS that is strictly aligned with grade level content.

73 www.necompact.org www.measuredprogress.org Gaye Fedorchak GFedorchak@ed.state.nh.us Sue Bechard sbechard@measuredprogress.org

