1 Innovations in assessment and evaluation Martin Valcke
2 Structure ‣ Advance organizer: evidence base for evaluation and assessment ‣ Activity 1: What is assessment and evaluation? ‣ Trends in assessment & evaluation ‣ Self and peer assessment ‣ Activity 2: Develop a self-assessment activity ‣ Rubrics ‣ Activity 3: Develop a rubric
3 Conclusions ‣ Consider the differences between measuring, evaluating and scoring ‣ Reconsider the responsibilities for assessment and evaluation ‣ Develop reflective competencies of staff and students
4 Advance organizer Where is the evidence about assessment and evaluation?
5 The importance of assessment & evaluation
7 Activity 1: What is assessment & evaluation? Write down the elements that define what assessment and/or evaluation comprise.
8 What is assessment & evaluation? Three elements: ‣ Measure: get data from student about his/her answer or behavior (test, exam, task) ‣ Evaluate: what is the value of the “answer/behavior”? ‣ Score: what score will be attributed to a quality level in the answer or behavior?
9 Question What responsibility would you hand over to students? ‣ Measure: get data from student about his/her answer or behavior (test, exam, task) ‣ Evaluate: what is the value of the “answer/behavior”? ‣ Score: what score will be attributed to a quality level in the answer or behavior?
10 Trends in Assessment & Evaluation Major changes in the place, role, function, focus, approach, tools, … in higher education Among trends: major shift in WHO is responsible for the evaluation and in the TOOLS being used
11 Trends in assessment & evaluation Shared characteristics: ‣ Focus on “behavior” ‣ Focus on “authentic” behavior ‣ Focus on “complex” behavior ‣ Explicit “criteria” ‣ Explicit “standards” ‣ Need for concrete feedback ‣ Focus on “consequential validity” (Gielen, Dochy & Dierick, 2003)
12 Trends in Assessment & Evaluation Trends according to Fant et al. (1985, 2000) : Assessment centres Self and Peer assessment Portfolio assessment Logbooks Rubrics
13 Trend 1 Self and Peer assessment Trends according to Fant et al. (1985, 2000) : Assessment centres Self and Peer assessment Portfolio assessment Logbooks Rubrics
14 Individual learner Group learner External institution Teachers Expert teacher Assessment system Institutional level
15 Definition self assessment Self assessment can be defined as “the evaluation or judgment of ‘the worth’ of one’s performance and the identification of one’s strengths and weaknesses with a view to improving one’s learning outcomes” (Klenowski, 1995, p. 146).
16 Definition peer assessment Peer assessment can be defined as “an arrangement in which individuals consider the amount, level, value, worth, quality, or success of the products or outcomes of learning of peers of similar status” (Topping, 1998, p. 250).
17 Self- and peer assessment Learn about your own learning process. Schmitz (1994): “assessment-as-learning”. ~ self-corrective feedback
18 See the experiential learning cycle of Kolb. Boekaerts (1991): self-evaluation as a competency. Development of metacognitive knowledge and skills (see Brown, Bull & Pendlebury, 1998, p. 181). Freeman & Lewis (1998, pp. 56-59): developing pro-active learners
19 The Learning Cycle Model
20 Self- and peer assessment in medical education: some studies
21 Accuracy
24 Tool for self-assessment
27 Attitudes
30 Attitudes 2
33 Reliability
36 Accuracy 2
39 Confidence / performance
42 Accuracy 3
45 Assessment ability
48 Longitudinal
51 Follow-up
54 Review: Accuracy
58 Review: Effectiveness
62 PA enhances performance
65 PA longitudinal stability
68 PA & rater selection
71 PA, formative & multiple observations
74 PA, hard to generalize
77 Is it possible? Group evaluations tend to fluctuate around the mean
78 Learning to evaluate ‣ Develop checklists ‣ Give criteria ‣ Ask learners to look for quality indicators ‣ Analyze examples of good and less good practice: develop a “nose” for quality
79 Learning to evaluate Freeman & Lewis (1998, p. 127): ‧ Learner develops a list of criteria. ‧ Pairs of learners compare their listed criteria. ‧ Pairs develop a criterion checklist. ‧ Individual application of the checklist. ‧ Use of the checklist to evaluate the work of another learner. ‧ Individual reworks his/her work. ‧ Final result is checked by the teacher and compared to the learner's evaluation. ‧ Pairs recheck their work on the basis of teacher feedback.
80 Learning to evaluate ‣ Peer evaluation is not the same as peer grading: the final score is given by the teacher! ‣ Part of the score could build on the accuracy of self/peer evaluation and self-correction ‣ Example: 1st year course Instructional Sciences
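The idea that part of the score could build on the accuracy of the self-evaluation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `final_score`, the 20-point scale and the 10% weight for accuracy are hypothetical and do not come from the slides or from the course mentioned above.

```python
def final_score(teacher_score: float, self_score: float,
                max_score: float = 20.0, accuracy_weight: float = 0.1) -> float:
    """Blend the teacher's score with a small reward for accurate self-assessment.

    All defaults are hypothetical: a 20-point scale and a 10% accuracy share.
    """
    for value in (teacher_score, self_score):
        if not 0 <= value <= max_score:
            raise ValueError("scores must lie between 0 and max_score")
    # Accuracy is 1.0 when the learner's own score matches the teacher's,
    # and drops linearly with the absolute difference between the two.
    accuracy = 1.0 - abs(teacher_score - self_score) / max_score
    return (1 - accuracy_weight) * teacher_score + accuracy_weight * accuracy * max_score


# A learner who rated the work 16/20 while the teacher gave 14/20.
print(round(final_score(14, 16), 2))
```

The teacher's judgment still dominates the result, which matches the point that peer evaluation is not peer grading; the accuracy share only nudges the score.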
87 Information processing
88 Activity 2: Develop a self-assessment exercise Develop the basic instructions for a self-assessment exercise.
89 Importance of feedback ‣ Where am I going? (feed up) ‣ How am I going? (feed back) ‣ Where to next? (feed forward) (Hattie & Timperley, 2007)
90 Trend 2 Rubrics Trends according to Fant et al. (1985, 2000) : Assessment centres Self and Peer assessment Portfolio assessment Logbooks Rubrics
95 Rubrics
96 Rubrics
98 Rubrics Rubric: scoring tool for the qualitative assessment of a complex, authentic activity. ‣ A rubric builds on criteria that are enriched with a scale that helps to determine mastery levels. ‣ For each mastery level, standards are available. ‣ A rubric shows both staff and students what is expected at process/product level. ‣ Rubrics can serve “high stake assessment” as well as “formative assessment” (in view of learning) (Arter & McTighe, 2001; Busching, 1998; Perlman, 2003). Rubrics focus on the relationship between competencies, criteria, and indicators and are organized along mastery levels (Morgan, 1999).
99 Rubrics ‣ Holistic vs. analytic ‣ Task-specific vs. generic
100 Assumptions about rubrics ‣ Greater consistency in scores (reliability). ‣ More valid assessment of complex behavior. ‣ Positive impact on subsequent learning activity.
101 Performance assessment Rubrics focus on the relationship between competencies, criteria, and indicators and are organized along mastery levels (Morgan, 1999).
102 Doubts? ‣ Approach marred by beliefs of staff/students about evaluation (see Chong, Wong, & Lang, 2004; Joram & Gabriele, 1998). ‣ Validity of criteria and indicators (Linn, 1990). ‣ Reliability when used by different evaluators (Flowers & Hancock, 2003).
103 Activity 3: develop a rubric
104 Research rubrics Review article covering 75 studies on rubrics: Jonsson, A., & Svingby, G. (2007). The use of scoring rubrics: Reliability, validity and educational consequences. Educational Research Review, 2, 130–144. ‣ (1) the reliable scoring of performance assessments can be enhanced by the use of rubrics, especially if they are analytic, topic-specific, and complemented with exemplars and/or rater training; ‣ (2) rubrics do not facilitate valid judgment of performance assessments per se. However, valid assessment could be facilitated by using a more comprehensive framework of validity; ‣ (3) rubrics seem to have the potential of promoting learning and/or improve instruction. The main reason for this potential lies in the fact that rubrics make expectations and criteria explicit, which also facilitates feedback and self-assessment.
105 Conditions for effective usage ‣ Develop an assessment frame of reference ‣ Training in usage ‣ Interrater usage
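Whether a rubric works across raters can be checked with an agreement statistic. The sketch below uses Cohen's kappa, a common choice that the slides themselves do not name; it assumes two raters who scored the same pieces of work on the same rubric levels. The function name `cohens_kappa` and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a: list[int], rater_b: list[int]) -> float:
    """Chance-corrected agreement between two raters on categorical levels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which both raters chose the same level.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's own level frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[level] * counts_b[level] for level in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same level throughout
    return (observed - expected) / (1 - expected)


# Hypothetical rubric levels (0 to 2) given by two raters to six assignments.
rater_1 = [0, 1, 2, 2, 1, 0]
rater_2 = [0, 1, 2, 1, 1, 0]
print(round(cohens_kappa(rater_1, rater_2), 2))
```

A value near 1 indicates that the raters apply the rubric in the same way; a value near 0 means their agreement is no better than chance, which would suggest more rater training or sharper level descriptions.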
106 Development rubric ‣ Choose criteria for the expected behavior: 4 to 15 statements describing each criterion ‣ Determine the bandwidth of quality differences: e.g. 0 to 5 qualitative levels ‣ Describe each value in the quality levels: concrete, observable qualifications
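The three development steps above (criteria, quality levels, a description per level) map naturally onto a small data structure. The following Python sketch is one possible representation, not a prescribed format: the names `Criterion` and `score_rubric`, the two example criteria and the rule of scoring as a fraction of the maximum are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One rubric criterion; descriptors[i] describes quality level i."""
    name: str
    descriptors: tuple[str, ...]  # descriptors[0] is the lowest level


def score_rubric(rubric: list[Criterion], ratings: dict[str, int]) -> float:
    """Return the attained levels as a fraction of the maximum (0.0 to 1.0)."""
    total = 0
    maximum = 0
    for criterion in rubric:
        level = ratings[criterion.name]
        top = len(criterion.descriptors) - 1
        if not 0 <= level <= top:
            raise ValueError(f"level {level} out of range for {criterion.name!r}")
        total += level
        maximum += top
    if maximum == 0:
        raise ValueError("rubric needs at least one criterion with two levels")
    return total / maximum


# Hypothetical analytic rubric: two criteria, levels 0 to 2.
example_rubric = [
    Criterion("argument", ("no claim stated",
                           "claim stated, weak support",
                           "claim supported by evidence")),
    Criterion("sources", ("no sources",
                          "sources listed",
                          "sources cited and evaluated")),
]

print(score_rubric(example_rubric, {"argument": 2, "sources": 1}))
```

Keeping the level descriptions next to the numeric levels means the same structure can be shown to students as feedback and used by staff for scoring.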
109 Critical thinking rubric
110 Informative websites Overview tools, examples, theory, background, research: Critical thinking rubrics: g.html Rubric generators: Intro on interesting rubric sites: Rubric APA research paper: General intro and overview:
111 Conclusions ‣ Consider the differences between measuring, evaluating and scoring ‣ Reconsider the responsibilities for assessment and evaluation ‣ Develop reflective competencies of staff and students
112 Innovations in assessment and evaluation Martin Valcke