New England Common Assessment Program Item Review Committee Meeting, March 30, 2005, Portsmouth, NH


1 New England Common Assessment Program Item Review Committee Meeting, March 30, 2005, Portsmouth, NH

2 Welcome and Introductions
● Tim Kurtz – Director of Assessment, New Hampshire Department of Education
● Michael Hock – Director of Educational Assessment, Vermont Department of Education
● Mary Ann Snider – Director of Assessment & Accountability, Rhode Island Department of Education
● Tim Crockett – Assistant Vice President, Measured Progress

3 Logistics
● Meeting Agenda
● Committee Member Expense Reimbursement Form
● Substitute Reimbursement Form
● NECAP Nondisclosure Form
● Handouts of presentations

4 Morning Agenda
Test Development: Past, Present & Future
● How did we get here? – Tim Kurtz, NH DoE
● Statistical Analyses – Tim Crockett, MP
● Bias/Sensitivity – Michael Hock, VT DoE
● Depth of Knowledge – Ellen Hedlund, RI DoE, and Betsy Hyman, RI DoE
● 2005-2006 Schedule – Tim Kurtz, NH DoE
● So, what am I doing here?

5 Item Review Committee
How did we get to where we are today?
Tim Kurtz, Director of Assessment, New Hampshire Department of Education

6 NECAP Pilot Review 2004-05
● 1st Bias Committee meeting – March
● 1st Item Review Committee meeting – April
● 2nd Item Review Committee meeting – July
● 2nd Bias Committee meeting – July
● Face-to-Face meetings – August
● Test Form Production and DOE Reviews – August

7 NECAP Pilot Review 2004-05
Reading and Mathematics
● Printing and Distribution – September
● Test Administration Workshops – October
● Test Administration – October 25-29
● Scoring – December
● Data Analysis & Item Statistics – January
● Teacher Feedback Review – February (has affected item review, accommodations, style guide and administration policies)
● Item Selection meetings – February & March

8 NECAP Pilot Review 2004-05
Writing
● Printing and Distribution – December & January
● Test Administration – January 24-28
● Scoring – March
● Data Analysis & Item Statistics – April
● Item Selection meetings – April & May

9 NECAP Pilot Review 2004-05
What data was generated from the pilot and what do we do with it?
Tim Crockett, Assistant Vice President, Measured Progress

10 Item Statistics
● The review of data and items is a judgmental process
● Data provides clues about the item:
● Difficulty
● Discrimination
● Differential Item Functioning

11 At the top of each page...

12 The Item and any Stimulus Material

13 Item Statistics Information

14 Item Difficulty (multiple-choice items)
● Proportion of students with a correct response. Range is from 0.00 (difficult) to 1.00 (easy)
● NECAP needs a range of difficulty, but
● below .30 may be too difficult
● above .80 may be too easy
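The difficulty statistic on this slide can be sketched in a few lines of Python. This is only an illustration, not the procedure Measured Progress used: the function names, the sample responses, and the answer key are hypothetical, and the .30 and .80 cut-offs are the review guidelines quoted on the slide.

```python
def item_difficulty(responses, key):
    """Proportion of students who chose the keyed (correct) option."""
    if not responses:
        raise ValueError("at least one response is required")
    return sum(1 for r in responses if r == key) / len(responses)


def flag_mc_difficulty(p, low=0.30, high=0.80):
    """Apply the slide's review guidelines to a difficulty value p."""
    if p < low:
        return "may be too difficult"
    if p > high:
        return "may be too easy"
    return "within range"


# Hypothetical responses from ten students to one item keyed "C".
responses = ["A", "C", "C", "B", "C", "C", "D", "C", "C", "C"]
p = item_difficulty(responses, key="C")
print(f"difficulty = {p:.2f}: {flag_mc_difficulty(p)}")
```

A flag is only a prompt for review. As the earlier slide notes, the data provide clues, and the final call on an item is a judgment.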

15 Item Difficulty (constructed-response items)
● Average score on the item. Range is from 0.00 to 2.00 or 0.00 to 4.00
● On 2-point items: below 0.4 may be too difficult; above 1.6 may be too easy
● On 4-point items: below 0.8 may be too difficult; above 3.0 may be too easy
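For constructed-response items the same idea applies to the average score. The sketch below is illustrative only: the names and sample scores are hypothetical, and the cut-offs are copied from the slide as stated.

```python
# Review guidelines from the slide: maximum points -> (low cut-off, high cut-off).
CR_GUIDELINES = {2: (0.4, 1.6), 4: (0.8, 3.0)}


def cr_difficulty(scores):
    """Average score earned on a constructed-response item."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


def flag_cr_difficulty(mean_score, max_points):
    """Compare an average score with the slide's guideline for that item type."""
    low, high = CR_GUIDELINES[max_points]
    if mean_score < low:
        return "may be too difficult"
    if mean_score > high:
        return "may be too easy"
    return "within range"


# Hypothetical scores from ten students on a 4-point item.
scores = [0, 1, 2, 2, 1, 3, 4, 2, 1, 2]
mean_score = cr_difficulty(scores)
print(f"average = {mean_score:.2f}: {flag_cr_difficulty(mean_score, max_points=4)}")
```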

16 Item Discrimination
● How well an item separates higher performing students from lower performing students
● Range is from -1.00 to 1.00
● The higher the discrimination the better
● Items with discriminations below .20 may not be effective and should be reviewed
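The slide does not say which discrimination statistic was reported. One common choice, assumed here purely for illustration, is the correlation between students' scores on the item and their total test scores, which has the -1.00 to 1.00 range described above. The function names and data are hypothetical.

```python
import math


def item_total_correlation(item_scores, total_scores):
    """Pearson correlation between item scores and total test scores."""
    n = len(item_scores)
    if n != len(total_scores) or n < 2:
        raise ValueError("need two equal-length lists with at least two students")
    mean_item = sum(item_scores) / n
    mean_total = sum(total_scores) / n
    cov = sum((i - mean_item) * (t - mean_total)
              for i, t in zip(item_scores, total_scores))
    var_item = sum((i - mean_item) ** 2 for i in item_scores)
    var_total = sum((t - mean_total) ** 2 for t in total_scores)
    if var_item == 0 or var_total == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return cov / math.sqrt(var_item * var_total)


def needs_review(discrimination, threshold=0.20):
    """Slide guideline: items below .20 may not be effective."""
    return discrimination < threshold


# Hypothetical data: item scored 0/1, alongside each student's total score.
item_scores = [0, 0, 0, 1, 1, 1]
total_scores = [10, 12, 15, 20, 22, 25]
d = item_total_correlation(item_scores, total_scores)
print(f"discrimination = {d:.2f}, review needed: {needs_review(d)}")
```

In practice the item is often excluded from the total before correlating, so that it is not correlated with itself. That refinement is left out to keep the sketch short.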

17 Other Discrimination Information (multiple-choice items)

18 Differential Item Functioning
● DIF (F-M) – females and males who performed the same on the test are compared on their performance on the item
● A positive number reflects females scoring higher
● A negative number reflects males scoring higher
● NS means no significant difference

19 Item Statistics Information

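20 Differential Item Functioning
Multiple Choice
[Figure: DIF scale running from above 0.10 (favoring females) through 0.00 to below -0.10 (favoring males). Categories: A = negligible (between -0.05 and 0.05), B = low (between 0.05 and 0.10 in either direction), C = high (beyond 0.10 in either direction).]
- Dorans and Holland, 1993
- For CR items:
● –.20 or +.20 represents negligible DIF
● –.30 or +.30 represents low DIF
● –.40 or +.40 represents high DIF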
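The multiple-choice scale on this slide (cut points at 0.05 and 0.10 on either side of zero, with categories A, B and C) can be read as a simple classification rule. The sketch below is one reading of that scale, not the vendor's procedure: the function name is hypothetical, and the handling of values that fall exactly on a cut point is an assumption.

```python
def classify_mc_dif(dif):
    """Classify a multiple-choice DIF (F-M) value into category A, B or C.

    Positive values mean females scored higher; negative values mean
    males scored higher. Boundary handling is an assumption.
    """
    size = abs(dif)
    if size < 0.05:
        category = "A (negligible)"
    elif size <= 0.10:
        category = "B (low)"
    else:
        category = "C (high)"

    if dif > 0:
        favored = "females"
    elif dif < 0:
        favored = "males"
    else:
        favored = "neither"
    return category, favored


# Hypothetical DIF values for three items.
for value in (0.02, 0.07, -0.12):
    category, favored = classify_mc_dif(value)
    print(f"DIF {value:+.2f}: {category}, favoring {favored}")
```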

21 Bias/Sensitivity Review
How do we ensure that this test works well for students from diverse backgrounds?
Michael Hock, Director of Educational Assessment, Vermont Department of Education

22 What Is Item Bias?
● Bias is the presence of some characteristic of an assessment item that results in the differential performance of two individuals of the same ability but from different student subgroups
● Bias is not the same thing as stereotyping, although we don’t want either in NECAP
● We need to ensure that ALL students have an equal opportunity to demonstrate their knowledge and skills

23 How Do We Prevent Item Bias?
Item Development Bias-Sensitivity Review Item Review Field-Testing Feedback Pilot-Testing Data Analysis (DIF)

24 Role of the Bias-Sensitivity Review Committee
The Bias-Sensitivity Review Committee DOES need to make recommendations concerning…
● Sensitivity to different cultures, religions, ethnic and socio-economic groups, and disabilities
● Balance of gender roles
● Use of positive language, situations and images
● In general, items and text that may elicit strong emotions in specific groups of students and, as a result, may prevent those groups of students from accurately demonstrating their skills and knowledge

25 Role of the Bias-Sensitivity Review Committee
The Bias-Sensitivity Review Committee DOES NOT need to make recommendations concerning…
● Reading Level
● Grade Level Appropriateness
● GLE Alignment
● Instructional Relevance
● Language Structure and Complexity
● Accessibility
● Overall Item Design

26 Passage Review Rating Form
“This passage does not raise bias and/or sensitivity concerns that would interfere with the performance of a group of students”

27 Universal Design
Improved Accessibility through Universal Design

28 Universal Design
Improved Accessibility through Universal Design
● Inclusive assessment population
● Precisely defined constructs
● Accessible, non-biased items
● Amenable to accommodations
● Simple, clear, and intuitive instructions and procedures
● Maximum readability and comprehensibility
● Maximum legibility

29 Item Complexity
How do we control item complexity?
Ellen Hedlund and Betsy Hyman, Office of Assessment and Accountability, Rhode Island Department of Elementary and Secondary Education

30 Depth of Knowledge A presentation adapted from Norman Webb for the NECAP Item Review Committee March 30, 2005

31 Bloom Taxonomy
● Knowledge: Recall of specifics and generalizations; of methods and processes; and of pattern, structure, or setting.
● Comprehension: Knows what is being communicated and can use the material or idea without necessarily relating it.
● Applications: Use of abstractions in particular and concrete situations.
● Analysis: Make clear the relative hierarchy of ideas in a body of material or to make explicit the relations among the ideas or both.
● Synthesis: Assemble parts into a whole.
● Evaluation: Judgments about the value of material and methods used for particular purposes.

32 U.S. Department of Education Guidelines
Dimensions important for judging the alignment between standards and assessments
● Comprehensiveness: Does assessment reflect full range of standards?
● Content and Performance Match: Does assessment measure what the standards state students should both know & be able to do?
● Emphasis: Does assessment reflect same degree of emphasis on the different content standards as is reflected in the standards?
● Depth: Does assessment reflect the cognitive demand & depth of the standards? Is assessment as cognitively demanding as standards?
● Consistency with achievement standards: Does assessment provide results that reflect the meaning of the different levels of achievement standards?
● Clarity for users: Is the alignment between the standards and assessments clear to all members of the school community?

33 Mathematical Complexity of Items (NAEP 2005 Framework)
The demand on thinking that the item requires:
● Low Complexity: Relies heavily on the recall and recognition of previously learned concepts and principles.
● Moderate Complexity: Involves more flexibility of thinking and choice among alternatives than do those in the low-complexity category.
● High Complexity: Places heavy demands on students, who must engage in more abstract reasoning, planning, analysis, judgment, and creative thought.

34 Depth of Knowledge (1997)
● Level 1, Recall: Recall of a fact, information, or procedure.
● Level 2, Skill/Concept: Use information or conceptual knowledge, two or more steps, etc.
● Level 3, Strategic Thinking: Requires reasoning, developing a plan or a sequence of steps, some complexity, more than one possible answer.
● Level 4, Extended Thinking: Requires an investigation, time to think and process multiple conditions of the problem.

40 Practice Exercise
● Read the passage, The End of the Storm
● Read and assign a DOK to each of the 5 test questions
● Form groups of 4-5 to discuss your work and reach consensus on a DOK for each test question

41 Issues in Assigning Depth-of-Knowledge Levels
● Variation by grade level
● Complexity vs. difficulty
● Item type (MC, CR, ER)
● Central performance in objective
● Consensus process in training
● Aggregation of DOK coding
● Reliabilities

42 Web Sites
● http://facstaff.wcer.wisc.edu/normw/
● Alignment Tool: http://www.wcer.wisc.edu/WAT/index.aspx
● Survey of the Enacted Curriculum: http://www.SECsurvey.org

43 NECAP Operational Test 2005-06
What is the development cycle for this year? What is your role in all this?
Tim Kurtz, Director of Assessment, New Hampshire Department of Education

44 NECAP Operational Test 2005-06
● 1st Bias Committee meeting – March 8-9 (18 teachers – 6 from each state)
● 1st Item Review Committee meeting – March 30 (72 teachers – 12 from each state in each content area)
● 2nd Item Review Committee meeting – April 27-28
● Practice Test on DoE website – early May
● 2nd Bias Committee meeting – May 3-4
● Face-to-Face meetings – May 25-27 & June 1-3
● Test Form Production and DOE Reviews – July

45 NECAP Operational Test 2005-06
● Printing – August
● Test Administration Workshops – Aug & Sept
● Shipments to schools – September 12-16
● Test Administration Window – October 3-21 (204,000 students and 25,000 teachers from the 3 states)
● Scoring – November
● Standard Setting – December (teachers and educators from the three states)
● Reports shipped to schools – Late January

46 TIRC – So, why are we here?
● This assessment has been designed to support a quality program in mathematics and English language arts.
● It has been grounded by the input of hundreds of NH, RI, and VT educators.
● Because we intend to release assessment items each year, the development process continues to depend on the experience, professional judgment and wisdom of classroom teachers from our three states.

47 TIRC – Our role.
● We have worked hard to get to this point.
● Today, you will be looking at passages in reading and some items in mathematics.
● The role of Measured Progress staff is to keep the work moving along productively.
● The role of DoE content specialists is to listen and ask clarifying questions as necessary.

48 TIRC – Your role?
You are here today to represent your diverse contexts. We hope that you…
● share your thoughts vigorously, and listen just as intensely – we have different expertise and we can learn from each other,
● use the pronouns “we” and “us” rather than “they” and “them” – we are all working together to make this the best assessment possible, and
● grow from this experience – I know we will.
And we hope that today will be the beginning of some new interstate friendships.

49 Information, Questions and Comments
● Tim Kurtz, Director of Assessment, NH Department of Education, TKurtz@ed.state.nh.us, (603) 271-3846
● Mary Ann Snider, Director of Assessment and Accountability, Rhode Island Department of Elementary and Secondary Education, MaryAnn.Snider@ride.ri.gov, (401) 222-4600 ext. 2100
● Michael Hock, Director of Educational Assessment, Vermont Department of Education, MichaelHock@education.state.vt.us, (802) 828-3115

