Methodological aspects related to establishing minimum standards for performance
Jean-Guy Blais, Université de Montréal
Neuchâtel / J.-G. Blais, January 2008
Standards
  What is a standard?
    Just enough
    Average
    Excellence
    All of the above
Manufacturing quality standards
Health standards
Environmental standards
Educational standards
Standards
  Education
    Systems, schools, teachers and students
  Nowadays
    Explicit and public standards
    Large-scale assessment
    Minimal standards for all: NCLB / AYP
    Fairness and accommodation
    Performance standards and tasks
Standards
  Related terminology / research:
    Mastery assessment
    Criterion-referenced measurement
    Cut-off scores
    Classification
Standards
  System's goals and students' competencies
  Values
  Reforms / trends
  Standards
  Test items / performance tasks
  Ratings of accomplished tasks according to standards
  Scoring model / scoring scale
    Compensatory model
    Conjunctive model
  Decision / consequences
  Report
  Press / TV…
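The difference between the compensatory and conjunctive scoring models listed on this slide can be illustrated with a minimal sketch. The task scores, cut values, and function names below are hypothetical and serve only as an illustration; they are not taken from the presentation. Under a compensatory rule a strong result on one task can offset a weak one, whereas under a conjunctive rule every task must reach its own cut.

```python
from typing import Sequence


def compensatory_pass(scores: Sequence[float], total_cut: float) -> bool:
    """Pass if the summed score reaches one overall cut (strong tasks offset weak ones)."""
    return sum(scores) >= total_cut


def conjunctive_pass(scores: Sequence[float], task_cuts: Sequence[float]) -> bool:
    """Pass only if every task score reaches the cut set for that task."""
    if len(scores) != len(task_cuts):
        raise ValueError("scores and task_cuts must have the same length")
    return all(score >= cut for score, cut in zip(scores, task_cuts))


# Hypothetical examinee: strong on two tasks, weak on the third.
scores = [9, 8, 3]
task_cuts = [5, 5, 5]   # assumed per-task minimums (conjunctive rule)
total_cut = 15          # assumed overall minimum (compensatory rule)

compensatory_result = compensatory_pass(scores, total_cut)
conjunctive_result = conjunctive_pass(scores, task_cuts)

print("compensatory:", compensatory_result)
print("conjunctive:", conjunctive_result)
```

With these assumed values the same examinee passes under the compensatory rule and fails under the conjunctive rule, so the choice of scoring model is itself part of the standard.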
Standards
  Hambleton 1980: 16 methods
    Judgmental
    Empirical
    Combination
  «…a point on a test score scale that is used to sort examinees into two categories that reflect different levels of proficiency…»
Standards
  Jaeger 1989
    Examinee centered
    Test centered
  Kane 1994: «…the performance standard is defined as the minimally adequate level of performance, …, it is the conceptual version of the desired level of competence, and the passing score is the operational version.»
  Berk 1995: 20 methods
  Applied Measurement in Education, 1995, 8(1): 50 methods
Hambleton & Pitoniak 2006: 25 methods
  Review of items and scoring rubric
  Review of candidates
  Review of candidates' work
  Review of score profiles
Cizek & Bunch 2007: 15 methods
  «…the process of establishing one or more cut scores on examinations.»
  Procedural process / technically sound
  Substantive process / fair decision
Standards
  Generic steps (Cizek & Bunch 2007)
    Choose a method
    Performance level labels / descriptions
    Select a panel
    Train participants
    Compile ratings / more than one round
    Review / consensus
    Document the process / validity
Standards
  Many studies, many reviews, 1978-2007:
    Regression and correlation studies
    Generalizability studies
    IRT studies
    Rasch studies
  Main feature: human judgment
Standards
  Many methods…
  Different methods… different results!
Standards
  Brennan 1996: Performance tasks / G study
    Task reliability is relatively small
    Equating scores on different performance tasks is difficult
    Rater reliability/consistency is not always good
  Haertel and Linn 1996: «Equating test score when examinees choose which problems to attempt depends on strong untestable assumptions.»
Standards
  Resnick and Resnick 1996: «Because the learning of skills and concepts is partly constrained by social contingencies and partly constrained by the curriculum and the instructional process, definition of standards will always be a mixture of our understanding of the learning process and our values.»
Standards
  …«Much of the research and at least 30 years of operational standard setting studies lead to one conclusion: making judgements about item difficulties is neither natural nor can panellists be trained readily to make these judgments.»
  Hand 1997: «What is the best classification rule? The answer is it depends.»
Standards
  Cizek & Bunch 2007: «The same methods used with equivalent groups of participants can produce different cut scores, sometimes very different.»
  The challenge of vertical scaling (equating, linking):
    Is there a continuous developmental construct across grades?
    The further apart the linked grades are, the more hazardous the results.
  The challenge of alternate assessments.
Standards
  G. Stone 1996, 2004, 2006: Rasch model
    «Individuals are not very good at establishing what examinees should know or be able to do.»
    Theoretical inconsistencies
    Standards should be about content, not scores.
    «Traditional standards cannot be expressed qualitatively, confronting the validity of meaning and the validity of score.»
Standards
  «(methods…) fail to meet goals of judge agreement and fail to produce reproducible standards.»
  «Judges are asked to perform a task that is too difficult and confusing.»
    The task: estimate the probability that a minimally competent person will do something successfully.
  «Standards defined by judge panels are inexorably connected to their normative experiences and are therefore wholly sample dependent.»
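The judgment task described on this slide (estimating the probability that a minimally competent person succeeds) resembles what is usually called an Angoff-type procedure; the slides do not name the method, so that label is an assumption here. The sketch below uses invented ratings from three hypothetical judges on four items to show how such probabilities are commonly turned into a cut score, and how disagreement between judges shows up as a spread in their individual cut scores.

```python
from statistics import mean, stdev

# Hypothetical ratings (not from the presentation): one row per judge,
# one column per item. Each value is the judge's estimate of the
# probability that a minimally competent examinee answers the item correctly.
ratings = [
    [0.6, 0.7, 0.5, 0.8],   # judge A
    [0.4, 0.6, 0.5, 0.7],   # judge B
    [0.8, 0.9, 0.6, 0.9],   # judge C
]


def judge_cut_scores(ratings):
    """Each judge's implied cut score: the sum of that judge's item probabilities."""
    return [sum(row) for row in ratings]


def panel_cut_score(ratings):
    """Panel cut score: the mean of the judges' individual cut scores."""
    return mean(judge_cut_scores(ratings))


per_judge = judge_cut_scores(ratings)
panel_cut = panel_cut_score(ratings)
spread = stdev(per_judge)   # sample standard deviation across judges

print("per-judge cut scores:", [round(c, 2) for c in per_judge])
print("panel cut score:", round(panel_cut, 2), "out of", len(ratings[0]))
print("between-judge spread (SD):", round(spread, 2))
```

In this invented example the judges' individual cut scores differ by a full point on a four-item test, which is the kind of disagreement the quotations on this slide refer to.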
Standards
  Blais 2004-2007: Qualitative standards
    Qualitative standards are more intuitive, but they overlap, somewhat like tectonic plates, as in the real world of evaluation/assessment.
    Personal development is not linear and does not occur at the same rate for everyone.
    Yearly standards should overlap, but yearly programme content does not overlap much.
    There is no free lunch.
Standards / Conclusion
  Relative standards, contextual standards: are they fixed for life?
  How long will they stand in a world moving fast forward?
  When do we have to review them? Each year? Every five years?
Standards / Conclusion
  Much of the controversy over standard setting in education is centered around disputes over what is or should be in the best interest of the public.
Standards / Conclusion
  «How much does the past shape the future?» B. Mandelbrot