An Evaluation of an Observation Rubric Used to Assess Teacher Performance
Kent Sabo, Kerry Lawton, Hongxia Fu
Arizona State University

Introduction
School reform in Arizona
–TAP reform model
–Revamped teacher evaluation & compensation systems
–Teacher evaluation includes:
  Calculation of teacher "value-added" on achievement
  Qualitative measure of teacher behavior and practice

TAP Rubric
Developed in 2001 by TAP (now NIET)
Teacher effectiveness framework includes four domains:
–Instruction (D1)
–Designing & Planning Instruction (D2)
–Learning Environment (D3)
–Teacher Responsibilities
The rubric includes 19 items divided across the first three domains.

D1: Instruction
1. Standards & Objectives
2. Motivating Students
3. Presenting Instructional Content
4. Lesson Structure & Pacing
5. Activities & Materials
6. Questioning
7. Academic Feedback
8. Grouping Students
9. Teacher Content Knowledge
10. Teacher Knowledge of Students
11. Thinking
12. Problem Solving

D2: Designing & Planning Instruction
1. Instructional Plans
2. Student Work
3. Assessment

D3: The Learning Environment
1. Expectations
2. Managing Student Behavior
3. Environment
4. Respectful Culture

Rubric Scoring
Between 4 and 6 observations per school year
Multiple observers
–Administrators & experienced teachers serve as observers
–Observers are trained & certified by NIET to score the rubric

Rubric Scoring
Items are scored 1-5
–Behavioral definitions at 1, 3, and 5 describe the behaviors, characteristics, and artifacts of teacher performance
–3 = "Proficient"
Scores are averaged & weighted by both domain & observer type
–For a single observation, item scores are averaged into subscale scores
–An overall score is also averaged & entered into the equation for performance awards (see the scoring sketch below)
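To make the averaging-and-weighting step concrete, here is a minimal Python sketch of how a single observation might be rolled up into subscale and overall scores. The item scores and, especially, the domain weights are illustrative assumptions, not NIET's published values.

```python
import numpy as np

# Hypothetical item scores (1-5) for one observation, grouped by domain.
scores = {
    "Instruction": [4, 3, 3, 4, 5, 3, 4, 3, 4, 3, 2, 3],   # 12 items
    "Designing & Planning Instruction": [4, 3, 4],          # 3 items
    "Learning Environment": [5, 4, 4, 5],                   # 4 items
}

# Assumed domain weights for illustration only (not the actual TAP weights).
domain_weights = {
    "Instruction": 0.75,
    "Designing & Planning Instruction": 0.10,
    "Learning Environment": 0.15,
}

# Subscale score: simple average of the items within each domain.
subscales = {domain: float(np.mean(items)) for domain, items in scores.items()}

# Overall score: weighted average of the subscale scores.
overall = sum(domain_weights[d] * subscales[d] for d in subscales)

print(subscales)
print(round(overall, 2))
```

Because scores are also weighted by observer type, a fuller version would repeat this roll-up per observer and then combine the results.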

Purpose of Current Study
Establishing validity is crucial when an instrument is used to make high-stakes decisions (Messick, 1995).
To date, most studies have focused on establishing evidence for test-retest reliability & the relationship with value-added achievement measures.
This study investigated the proposed latent variable structure of the rubric.

Methods
Step 1. Conduct a CFA on the 2nd-order factor model
–Not explicitly defined by TAP, but implicit in the TAP scoring system (averaging)
Step 2. If the CFA suggests misspecification, conduct an EFA on the 19 indicators
Sample: 1,497 teacher observation scores
–Across 53 public schools in Arizona

2nd-order Factor Model

Methods
The 2nd-order model is just-identified.
–Impossible to interpret fit statistics
–Must make a decision on fit by analyzing nested models, both more and less restrictive (Rindskopf & Rose, 1988)
Additional models examined (see the CFA sketch below):
–Bi-factor (least restrictive)
–Correlated group factor
–One-factor (most restrictive)
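As an illustration of how the competing nested models could be fit, the sketch below specifies a one-factor model and a correlated group-factor model in lavaan-style syntax with the Python semopy package. The item variable names (i1-i19), the input file, and the semopy calls (Model, fit, calc_stats) are assumptions made for illustration; the resulting fit indices would be compared across models as described above.

```python
import pandas as pd
from semopy import Model, calc_stats  # assumed semopy interface

# Assumed: a DataFrame with one column per rubric item, i1 .. i19.
data = pd.read_csv("tap_rubric_items.csv")  # hypothetical file

one_factor = "Effectiveness =~ i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9 + i10 + i11 + i12 + i13 + i14 + i15 + i16 + i17 + i18 + i19"

group_factor = """
Instruction =~ i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9 + i10 + i11 + i12
Planning    =~ i13 + i14 + i15
Environment =~ i16 + i17 + i18 + i19
Instruction ~~ Planning
Instruction ~~ Environment
Planning    ~~ Environment
"""

for name, desc in [("one-factor", one_factor), ("correlated group factor", group_factor)]:
    model = Model(desc)
    model.fit(data)
    stats = calc_stats(model)  # assumed to return fit indices such as CFI and RMSEA
    print(name)
    print(stats.T)
```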

Bi-factor Model

Correlated Group Factor Model

One-factor Model

CFA Results
Results for all models suggest misspecification.
–Bi-factor: CFI = .96; RMSEA = .05; SRMR = .02; 8 of 12 negative factor loadings (Instruction); 18 significant modification indices
–Group factor: CFI = .93; RMSEA = .08; SRMR = .04; high factor correlations; 29 significant modification indices
–One factor: CFI = .90; RMSEA = .09; SRMR = .04; 79 significant modification indices

Step 2: Method
Exploratory factor analysis on the 19 indicators
–Extraction: principal axis factoring
–Factor retention: parallel analysis
–Rotation: Promax
Sample: 1,497 teachers (observation scores)
(A code sketch of this procedure follows below.)
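A minimal sketch of this step, assuming the Python factor_analyzer package and a DataFrame of the 19 indicator scores (file name and column names are placeholders). Parallel analysis is implemented directly by comparing observed eigenvalues against eigenvalues from random data of the same shape, and factor_analyzer's "principal" method is used as a stand-in for principal axis factoring.

```python
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer  # assumed package

# Assumed: a (1497 x 19) DataFrame of item scores, one column per rubric item.
items = pd.read_csv("tap_item_scores.csv")  # hypothetical file

# Parallel analysis: retain factors whose observed eigenvalues exceed the mean
# eigenvalues obtained from random data with the same number of rows/columns.
n_obs, n_items = items.shape
obs_eigs = np.sort(np.linalg.eigvalsh(items.corr().values))[::-1]
rng = np.random.default_rng(seed=0)
rand_eigs = np.empty((100, n_items))
for rep in range(100):
    rand = rng.standard_normal((n_obs, n_items))
    rand_eigs[rep] = np.sort(np.linalg.eigvalsh(np.corrcoef(rand, rowvar=False)))[::-1]
n_factors = int(np.sum(obs_eigs > rand_eigs.mean(axis=0)))

# Principal-axis-style extraction with Promax rotation on the retained factors.
fa = FactorAnalyzer(n_factors=n_factors, method="principal", rotation="promax")
fa.fit(items)
loadings = pd.DataFrame(fa.loadings_, index=items.columns)
print(n_factors)
print(loadings.round(2))
```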

Results of PAF Extraction (Factor Pattern Matrix, Rotated)

Item                               Factor 1   Factor 2   Factor 3
Instructional Plans                  .780      -.013       .034
Student Work                         .728      -.027       .132
Assessment                           .796      -.113       .076
Expectations                         .442       .499      -.068
Managing Student Behavior            .004       .869      -.055
Environment                         -.005       .706       .180
Respectful Culture                  -.059       .898       .043
Standards & Objectives               .835      -.008      -.041
Motivating Students                  .459       .278       .109
Presenting Instructional Content     .772       .091      -.054
Lesson Structure & Pacing            .616       .264      -.095
Activities & Materials               .692       .018       .110
Questioning                          .511       .076       .222
Academic Feedback                    .507       .167       .111
Grouping Students                    .481       .284       .019
Teacher Content Knowledge            .714       .054       .030
Teacher Knowledge of Students        .500       .311      -.044
Thinking                             .060       .032       .830
Problem Solving                      .027       .012       .856

PAF Extraction (Rotated): Factor 1
Instructional Plans                  .780
Student Work                         .728
Assessment                           .796
Expectations                         .442
Managing Student Behavior            .004
Environment                         -.005
Respectful Culture                  -.059
Standards & Objectives               .835
Motivating Students                  .459
Presenting Instructional Content     .772
Lesson Structure & Pacing            .616
Activities & Materials               .692
Questioning                          .511
Academic Feedback                    .507
Grouping Students                    .481
Teacher Content Knowledge            .714
Teacher Knowledge of Students        .500
Thinking                             .060
Problem Solving                      .027

PAF Extraction (Rotated): Factor 2
Instructional Plans                 -.013
Student Work                        -.027
Assessment                          -.113
Expectations                         .499
Managing Student Behavior            .869
Environment                          .706
Respectful Culture                   .898
Standards & Objectives              -.008
Motivating Students                  .278
Presenting Instructional Content     .091
Lesson Structure & Pacing            .264
Activities & Materials               .018
Questioning                          .076
Academic Feedback                    .167
Grouping Students                    .284
Teacher Content Knowledge            .054
Teacher Knowledge of Students        .311
Thinking                             .032
Problem Solving                      .012

PAF Extraction (Rotated): Factor 3
Instructional Plans                  .034
Student Work                         .132
Assessment                           .076
Expectations                        -.068
Managing Student Behavior           -.055
Environment                          .180
Respectful Culture                   .043
Standards & Objectives              -.041
Motivating Students                  .109
Presenting Instructional Content    -.054
Lesson Structure & Pacing           -.095
Activities & Materials               .110
Questioning                          .222
Academic Feedback                    .111
Grouping Students                    .019
Teacher Content Knowledge            .030
Teacher Knowledge of Students       -.044
Thinking                             .830
Problem Solving                      .856

EFA Results

Currently Proposed                               Our Results
1. Instruction (12 items)                        1. Instructional Effectiveness (13 items)
2. Learning Environment (4 items)                2. Learning Environment (4 items)
3. Designing & Planning Instruction (3 items)    3. Thinking/Problem Solving (2 items)

Extracted 3 factors, the same number as proposed
Identical item loadings on the "Learning Environment" factor
Different item loadings for the other factors
–Our "Instructional Effectiveness" factor included the items from the proposed "Instruction" and "Designing & Planning Instruction" factors, except Thinking and Problem Solving
–The two items in our "Thinking/Problem Solving" factor belong to the "Instruction" factor in the proposed model

Discussion
This study did not add evidence that scores from the TAP rubric (as currently scored) can be used to make global inferences regarding teacher quality.
–The original structure proposed by the rubric developers was not recovered.
–Differences in the number of items per factor are problematic when the composite is a simple average.
The Thinking/Problem Solving factor may be problematic.
–Scoring these items requires observers to infer across multiple time points.
–Is it appropriate within a teacher evaluation assessment? Are these teacher-influenced or student-related items?
The rubric may still provide useful information for formative assessment and evaluation.
–Items have "face validity."
–Examine individual items rather than composites to focus improvement.

Limitations and Cautions
Factor analysis is only one method through which to create evidence for validity.
–Further research should attempt to correlate rubric scores with other positive academic outcomes (e.g., graduation, pro-academic student behaviors).
FA results are based on a specific dataset.
–Our results should not be extended beyond the data used and the population assessed.
Ideas for future research:
–Estimate CFA models from the averaged & weighted scores.
–Examine measurement invariance across grade level, subject taught & teacher type.
–Examine latent growth patterns within and across several school years.