Jack B. Monpas-Huber, Ph.D.
Director of Assessment and Student Information

How Do We Know When They’ve Learned It?
Guidance for Development of Common Assessments

(206) Office (206) Cell
Features of a quality common assessment

1. The purpose of the common assessment is clear. Formative? Summative? Where in the learning process?
2. What the common assessment is intended to measure is clear. What are the power standards, or learning targets? Develop a test map.
3. The instrument gathers the right kind of data for the learning target. Knowledge, or process skill? Selected-response, or performance assessment?
4. The instrument gathers the same kind of data in a consistent way. For a performance assessment, develop a common rubric and agree on its application. The rubric, not teacher autonomy or preference, needs to guide the scores.
5. The instrument gathers enough data to provide sufficient evidence of learning, not chance. Three tasks.
Developing common assessments
Steps in the development process

1. Define power standards
   What are the big ideas that we expect students to learn in this period of time?
2. Develop an assessment (test) map
   How to measure the power standards? What kinds of items are appropriate? How large/long should this assessment be? How many tasks do we need to adequately measure mastery?
3. Develop (or populate with existing) items/tasks/scoring rubrics
   What do you have already? Which standards do they measure?
4. Review and piloting
   Are we really measuring what we say we’re measuring? Is anything confusing, ambiguously worded, or biased?
5. Standard setting
   What counts as proficiency?
The importance of the “test map”

1. It lays out a plan for assessing the expectations.
2. It connects expectations to instruments.
3. It ensures that the assessment covers what was taught/expected.
4. It builds consistency into assessment practice.

[Figure: A Test Map from 7th Grade Math]
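One way to make the idea of a test map concrete is to write it down as a simple lookup from each power standard to the items that measure it, and then check coverage. The sketch below is only an illustration: the standard names, item labels, and the `under_measured` helper are hypothetical, and the minimum of three tasks per standard is an assumption borrowed from the "Three tasks" guideline.

```python
# Assumed minimum number of tasks per standard (from the "Three tasks" guideline).
MIN_TASKS = 3

# Hypothetical test map: each power standard lists the items that measure it.
test_map = {
    "Standard A (hypothetical)": ["Item 1", "Item 2", "Item 3"],
    "Standard B (hypothetical)": ["Item 4", "Item 5"],
    "Standard C (hypothetical)": ["Item 6", "Item 7", "Item 8", "Item 9"],
}


def under_measured(test_map, min_tasks=MIN_TASKS):
    """Return the standards that have fewer than min_tasks items, with their item counts."""
    return {
        standard: len(items)
        for standard, items in test_map.items()
        if len(items) < min_tasks
    }


# Report any standard that still needs more items.
for standard, count in under_measured(test_map).items():
    print(f"{standard}: only {count} item(s), need at least {MIN_TASKS}")
```

A spreadsheet with one row per standard does the same job; the point is that the map makes gaps in coverage visible before the assessment is given.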
Standard setting

Is… a (judgmental) “process of establishing cut scores on examinations” (Cizek, p. 225)
Is not… a “search for a knowable boundary that exists a priori between categories, with the task of standard setting participants simply to discover it” (Cizek, p. 227)

Standards must be set because decisions must be made on some basis.

For established common assessments
Recommendation: Some variation of the “bookmark method”
1. Items/tasks (re)ordered by difficulty (based on difficulty data)
2. Judges place a bookmark where they believe the cutoff for proficiency should be
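The two bookmark steps above can be sketched in a few lines. This is a simplified illustration, not the full bookmark procedure: it assumes difficulty data are available as the proportion of students answering each item correctly, that each judge's bookmark is recorded as the number of items in the ordered booklet a barely proficient student should answer correctly, and that the panel's recommendation is the median of those placements. The item labels, values, and function names are all hypothetical.

```python
from statistics import median


def order_items_by_difficulty(p_values):
    """Order item ids from easiest to hardest.

    p_values maps item id -> proportion of students who answered correctly,
    so a higher proportion means an easier item.
    """
    return sorted(p_values, key=lambda item: p_values[item], reverse=True)


def panel_bookmark(bookmarks):
    """Summarize the judges' bookmark placements with their median (an assumption)."""
    return median(bookmarks)


# Hypothetical difficulty data from a previous administration.
p_values = {"Q1": 0.92, "Q2": 0.41, "Q3": 0.75, "Q4": 0.58, "Q5": 0.30}
ordered = order_items_by_difficulty(p_values)

# Hypothetical placements from three judges.
bookmarks = [3, 4, 3]
cut = panel_bookmark(bookmarks)

print("Ordered booklet:", ordered)
print("Recommended cut (items correct):", cut)
```

Using the median keeps one unusually high or low judge from moving the cut score much; a team could just as reasonably discuss the placements and agree on a single bookmark.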
Collecting good evidence of instructional effectiveness

The Split and Switch Design [1]
A variation on the traditional pretest-posttest design

1. Create two forms of a test, roughly equal in difficulty.
2. Split the class into two halves.
3. Half the class takes Form A as the pretest; the other half takes Form B as the pretest.
4. Instruction.
5. Switch forms for the posttest.
6. Blind-score all Form A tests (pretests and posttests scrambled together), then all Form B tests (also scrambled).
7. Calculate gains on Form A and on Form B.

Note: Typically you would subtract the pretest score from the posttest score for each student and then average the gain scores. Not in this case! Instead, you subtract the Form A pretest mean from the Form A posttest mean, even though those means are based on different students. You can still make inferences about instruction because all students received the same instruction (treatment).

Try it!

[1] Popham, J. (2001). The truth about testing. Alexandria, VA: Association for Supervision and Curriculum Development.
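The gain calculation in the split and switch design can be worked through with a small example. The scores below are made up for illustration, and the function name `split_and_switch_gains` is hypothetical; the only thing taken from the design itself is that each gain is a posttest mean minus a pretest mean on the same form, computed across different halves of the class.

```python
from statistics import mean


def split_and_switch_gains(form_a_pre, form_a_post, form_b_pre, form_b_post):
    """Return (gain on Form A, gain on Form B).

    Each gain is the posttest mean minus the pretest mean for that form.
    The two means for a form come from different halves of the class.
    """
    gain_a = mean(form_a_post) - mean(form_a_pre)
    gain_b = mean(form_b_post) - mean(form_b_pre)
    return gain_a, gain_b


# Hypothetical scores for a class of eight students.
form_a_pre = [4, 5, 6, 5]    # half 1, before instruction
form_b_pre = [5, 4, 6, 5]    # half 2, before instruction
form_a_post = [8, 9, 7, 8]   # half 2, after instruction
form_b_post = [9, 9, 8, 8]   # half 1, after instruction

gain_a, gain_b = split_and_switch_gains(form_a_pre, form_a_post, form_b_pre, form_b_post)
print(f"Form A gain: {gain_a}")
print(f"Form B gain: {gain_b}")
```

With these made-up numbers, Form A moves from a mean of 5 to a mean of 8 and Form B from 5 to 8.5. Seeing a similar gain on both forms is what supports the inference about instruction, since no student saw the same form twice.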