TEAM Professional Learning Package: Inter-Rater Reliability
Module 1: Accurate Ratings
Learning Session 2: Calibration of Ratings

*Note to facilitator—prior to conducting this learning session, it is strongly advised that you watch and script the video lesson and thoroughly review the ratings assigned by TN raters in order to prompt and promote learning. Guidance for selecting a video can be found in the package guidance document.

Say: Welcome to today's professional learning session on inter-rater reliability. Inter-rater reliability is essential to the teacher evaluation process to eliminate biases and sustain transparency, consistency, and impartiality. This work will not only deepen our own understanding of TEAM, but also improve teacher perception of our local implementation of teacher evaluation.

*Note to facilitator—survey responses from the Tennessee Educator Survey may be shared to reinforce the need for this module. See TM_E5.d on the Tennessee Educator Survey at

*Note to facilitator—if completing all modules of the professional learning package,

Say: Over the next few professional learning sessions we will
- Build our understanding of performance levels, indicators, and descriptors
- Calibrate our ratings
- Calibrate our feedback with a structured framework
- Consider the impact of this work on your own evaluation (C1 on the TEAM Administrator rubric).

We will be walking through this session as learners—active participants—so your engagement will be key to constructing new learning.

*Note to facilitator—this session is designed to allow the learner to construct his or her own learning, so there will be several points of self-assessment and reflection rather than telling and instructing. You will be facilitating learning, not giving a lecture.
Module 1: Accurate Ratings
Learning Session 1: Understanding Performance Levels, Indicators, and Descriptors
Learning Session 2: Calibration of Ratings

Say: In Learning Session 1, we revisited the performance levels, indicators, and descriptors to deepen understanding of instructional practice and how it should be rated. In this learning session, we will apply that learning as we engage in a calibration activity. Calibrating is the process of using a single set of rating criteria among multiple raters. If calibration is successful, a particular body of evidence should receive the same rating regardless of who rated it. Without it, classroom observations will not yield reliable data.
Guiding Questions for Learning Session 2
- How can regular calibration at the district and/or school level eliminate biases and sustain transparency, consistency, and impartiality in my practices?
- How will improved inter-rater reliability in ratings gained through calibration benefit my school and district?

Say: Take a moment and read our guiding questions to yourself. [Pause for reading time.] Which of these is of greater interest to you? Take 90 seconds and share both your selection and why you chose it with a shoulder partner.

*Note to facilitator—a group share-out is not necessary.
Calibration of Scores

1. Watch lesson and collect evidence.
2. Score with partner.
3. Come to group consensus and chart.
4. Debrief for accuracy.

*Note to facilitator—this is the process for calibration. You may have pre-assigned the video and set the expectation that the lesson be scripted, that ratings be assigned to all or some indicators, and that a rationale be provided for the ratings, or you may do that activity as part of this learning session.
Calibration Process

Script and collect evidence.
In pairs or small groups, review and consider the evidence collected in terms of each descriptor. Use the handbook to help guide thinking.

Using the performance level guide, pairs/small groups come to consensus and assign a rating on the consensus chart for each indicator being calibrated.

The whole group discusses any indicators differing by more than one performance level, or any outliers, to come to consensus.

TN raters' scores should be added to the consensus chart and discussed as a whole group as needed.

*Note to facilitator—animated slide; reveal instructions as needed. The first bullet may be deleted if the lesson was pre-scripted and scored. If scripting, allow time for everyone to prepare for lesson scripting.

*Note to facilitator—have the TN raters' scores and evidence ready for the debrief.

Say: We will be watching and collecting evidence on [insert information on video selection]. Prepare to script evidence from the lesson.

Or: You have watched and collected evidence on [insert information on video selection]. Please gather that data from the lesson.

Play video if applicable.

*Note to facilitator—assign participants to pairs or small table groups based on the size of the audience. Pre-create a consensus chart with columns for each pair or group's scores, including a blank column for the TN raters' scores (if using a pre-assigned video), and rows for each indicator. Pictures illustrating steps one, two, and three are included in the Calibration Chart file. Do not add TN raters' scores until all groups have charted all scores.

Say: With your shoulder partner/in your small group, review evidence and assign a rating for [indicators being calibrated]. Refer to the TEAM rubric and handbook to support your thinking. Discuss the evidence together and come to a consensus on a rating for each indicator. Remember that in reaching consensus, it is critical that all opinions, ideas, and concerns are taken into account.
Through listening closely to each other, your goal is to determine a rating that is agreeable to everyone. When each pair/small group has completed rating, add your ratings to the chart I have displayed [insert location]. We will take [time in minutes] to do this. Have a representative record ratings for each indicator. Be prepared to share the evidence that led you to your decision.

*Note to facilitator—provide each pair or table group with a different colored marker for charting their consensus scores. By giving each group a different colored marker, you will be able to track group responses. This will make it easier to identify any groups that consistently fall into an outlier scoring range and will inform any extra support that might be needed.

*Note to facilitator—by charting only group consensus scores, you are decreasing the risk for any individual observer. This allows for stress-free changes to scores and allows the learner to change his or her own thinking privately.

*Note to facilitator—whole-group discussion: ideally, most scores will be numerically close, but you may see some wide variation. Look for outlier indicator scores that are distinctly different from all other indicators, as well as any indicator on which one or more groups are two or more points away from the TN raters' scores. Remember, you have not yet added the TN raters' scores. Groups should work to come to consensus by sharing evidence.

*Note to facilitator—after final group consensus has been established, write the TN raters' scores on the chart. If group scores are within one point of the TN raters' scores, no rater evidence reveal is needed. If they are two or more points away from the TN raters' scores, read and discuss the TN raters' evidence. Rater evidence documents are provided in the Videos, Evidence, Ratings file along with the corresponding videos. Read the appropriate parts to the group and encourage discussion around evidence to debrief the observers' thinking.
Consider circling any scores that are two or more points away from the TN raters' scores.

*Note to facilitator—results of this calibration will assist you in assessing the needs of the large group and/or small groups. Plans for additional PD around TEAM should be made accordingly.

Examples for identifying outlier scores:
- TN raters score an indicator at 2. The five groups score 1, 2, 3, 3, 1 on that indicator. No group is more than one point higher or lower than the TN raters, so no reveal is needed.
- TN raters score an indicator at 3. The five groups score 2, 4, 3, 1, 5 on that indicator. One group is two points higher and one group is two points lower than the TN raters, so evidence must be shared and debriefed.
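For facilitators who track charted scores in a spreadsheet or script, the outlier rule above (flag any group score two or more points away from the TN raters' score) can be sketched in a few lines. This is an illustrative sketch only; the function name and threshold parameter are not part of the TEAM materials.

```python
def flag_outliers(tn_score, group_scores, threshold=2):
    """Return the group scores that call for an evidence reveal:
    those differing from the TN raters' score by the threshold or more."""
    return [s for s in group_scores if abs(s - tn_score) >= threshold]

# Example 1 from the text: TN raters score the indicator at 2.
print(flag_outliers(2, [1, 2, 3, 3, 1]))  # -> [] (no reveal needed)

# Example 2 from the text: TN raters score the indicator at 3.
print(flag_outliers(3, [2, 4, 3, 1, 5]))  # -> [1, 5] (share and debrief evidence)
```

The two calls reproduce the worked examples above: every score in the first set is within one point of the TN raters, while the second set contains one group two points below and one two points above.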
Debrief: Calibration of Scores
- What takeaways might you be having about the impact of including multiple observers during live observations?
- About which of your own evaluation practices might you have most changed your thinking?
- How might this new understanding change your approach to observation?

Say: Read the questions on the screen [pause]. Take two minutes and share your thoughts with your shoulder partner.

[After two minutes have passed…]

Say: Who would like to share how your thoughts around scoring may have changed?

*Note to facilitator—accept all answers and encourage responses. Do not agree or disagree; simply accept. Consider simply saying "thank you" to each response.

*Note to facilitator—if you identify a pattern in the answers, you may wish to call it out at the conclusion of the group share.
Learning to Application
Say: This learning session closes with an activity titled "Learning to Application." It is designed to help everyone build on the knowledge they have developed during this learning session and prepare for the next one.
Learning to Application
Prior to our next learning session:
- Conduct a co-observation with someone here today to collect evidence.
- Independently assign ratings to the indicators being rated and prepare feedback for the post-conference.
- Share ratings and come to consensus on any indicators where ratings differ by more than one performance level.
- Discuss similarities and differences in the prepared feedback.

*Note to facilitator—read the next steps on the slide. Provide a timeline and request feedback from the Learning to Application activity as needed. Module 2 centers on strong feedback as part of post-conferences, so this activity will prepare participants for that learning session.