Construct Validity of Classroom Observations: Items, Factors, Raters, and Achievement Lee Branum-Martin, Coleen D. Carlson, Angelia Durand, Christopher.

Presentation transcript:

Construct Validity of Classroom Observations: Items, Factors, Raters, and Achievement
Lee Branum-Martin, Coleen D. Carlson, Angelia Durand, Christopher Barr
Texas Institute for Measurement, Evaluation, and Statistics, University of Houston
Society for Research on Educational Effectiveness, March 4, 2010

A Generalizability Theory Approach (following Raudenbush)
Classroom Quality = items + construct + rater + time + school + grade...
Generalizability: Raudenbush (2008). Statistical inference when classroom causality is measured with error.
Context measurement: Raudenbush & Sampson (1999). Ecometrics. Raudenbush, Rowan, & Kang (1991). A multilevel, multivariate model for school climate with estimation via the EM algorithm and application to US high school data.
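To make the additive decomposition concrete, here is a minimal sketch that assumes a much simpler design than the study's: every rater scores every classroom once, and a score is a classroom effect plus a rater effect plus residual. It estimates the three variance components by the usual method of moments for a two-way crossed layout. The function names, the design, and the simulation settings are illustrative assumptions, not the authors' model or data.

```python
import random


def variance_components(scores):
    """Method-of-moments variance components for a fully crossed
    classroom x rater table with one score per cell.

    scores[i][j] is the score rater j gave classroom i (hypothetical layout).
    """
    n_c, n_r = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_c * n_r)
    row_means = [sum(row) / n_r for row in scores]
    col_means = [sum(row[j] for row in scores) / n_c for j in range(n_r)]

    # Mean squares for classrooms, raters, and the residual (interaction + error).
    ms_class = n_r * sum((m - grand) ** 2 for m in row_means) / (n_c - 1)
    ms_rater = n_c * sum((m - grand) ** 2 for m in col_means) / (n_r - 1)
    ss_resid = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n_c)
        for j in range(n_r)
    )
    ms_resid = ss_resid / ((n_c - 1) * (n_r - 1))

    # Negative estimates are truncated at zero.
    return {
        "classroom": max((ms_class - ms_resid) / n_r, 0.0),
        "rater": max((ms_rater - ms_resid) / n_c, 0.0),
        "residual": ms_resid,
    }


def simulate(n_classrooms=200, n_raters=20, sd_class=1.0, sd_rater=0.5,
             sd_resid=0.7, seed=1):
    """Simulate score = classroom effect + rater effect + residual."""
    rng = random.Random(seed)
    classroom = [rng.gauss(0, sd_class) for _ in range(n_classrooms)]
    rater = [rng.gauss(0, sd_rater) for _ in range(n_raters)]
    return [[c + r + rng.gauss(0, sd_resid) for r in rater] for c in classroom]
```

Calling `variance_components(simulate())` should return estimates in the neighborhood of the squared standard deviations passed to `simulate`. A real observation study is rarely fully crossed, which is one reason the slides turn to factor and multilevel models instead.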

An Ecometric Approach (in response to Raudenbush)
Item score = construct + rater + time + school + grade...
[Path diagram: the Oral Language factor is measured by the items Development, Conversation, Uses, and Vocabulary; the Organization factor is measured by the items Furnishings, Arrangement, and Engagement.]

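Rater Differences
Item score = construct + rater + time + school + grade...
[Path diagram: Rater as a predictor in the two-factor model. Oral Language (items Development, Conversation, Uses, Vocabulary; loadings λ11 to λ14) and Organization (items Furnishings, Arrangement, Engagement; loadings λ25 to λ27), with factor variances Ψ11 and Ψ22 and covariance Ψ21.]
Factor means: ratings are valid, but differ in severity.
Factor variances and covariance (Ψ11, Ψ22, Ψ21): ratings are valid, but differ in factor variances.
Factor loadings (λ11 to λ27): raters differ in validity.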

Time Differences
Item score = construct + rater + time + school + grade...
[Path diagram: Time (school year, semester, month, class session, segment of session) as a predictor in the two-factor model. Oral Language (items Development, Conversation, Uses, Vocabulary; loadings λ11 to λ14) and Organization (items Furnishings, Arrangement, Engagement; loadings λ25 to λ27), with factor variances Ψ11 and Ψ22 and covariance Ψ21.]
Factor means: classroom factors differ over time.
Factor variances and covariance (Ψ11, Ψ22, Ψ21): the variances and relations among classroom factors differ over time.
Factor loadings (λ11 to λ27): factor composition differs over time; the nature of classroom ecology changes.

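Sample & Design
[Table: rows are the fall and spring semesters of each study year, plus a Total row; columns are Grade (K, 1, 2, 3, Total), Unique Teachers, and Schools. Most cell values did not survive in this transcript. Legible entries in the Total row: K = 1,006; grade 1 = 1,043; unique teachers = 2,172; schools = 158.]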

Example Items for Classroom Management

Understanding of rules
1. Children appear to have limited understanding of rules and routines. This is evident in the classroom as children engage frequently in conflicts and rarely in purposeful activity.
2. Children appear to understand regular rules and routines, but there is occasionally a need for teacher reminders or reinforcement about some rules and routines.
3. Children appear to have internalized regular rules and routines. This is evident as children move through the classroom period smoothly, with few conflicts, and are most often seen engaged in purposeful activity.

Communication of expectations
1. Expectations for children's behavior may be confusing or inconsistent, conflicts may be inconsistently resolved.
2. Expectations for children's behavior are communicated from teacher to children.
3. Clear expectations for children's behavior are consistently communicated in multiple ways.

Adapted from the ELLCO (Smith et al., 2002).

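Fit Statistics: Grades, Years, Semesters
[Table: columns are Step, Model, χ², df, CFI, TLI, RMSEA, and WRMR. Rows are separate fall and spring models, followed by fall across years and grades, spring across years and grades, and scoring all waves. Cell values did not survive in this transcript.]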
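For readers unfamiliar with the indices named in the table header, the sketch below shows the textbook formulas for RMSEA, CFI, and TLI computed from model and baseline chi-square values. It is an illustration only: the function names are made up here, the numbers in the usage note are hypothetical, and software for categorical items typically applies these formulas to an adjusted chi-square and may use N rather than N - 1 in RMSEA. WRMR is omitted because it needs the residuals, not just the chi-square.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker-Lewis index; unlike CFI it is not bounded to [0, 1]."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)
```

With hypothetical inputs such as a model chi-square of 100 on 50 df, a baseline chi-square of 1060 on 60 df, and 501 observations, `cfi` and `tli` come out near 0.95 and 0.94, and `rmsea` is a little under 0.05.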

Model Results: Factor Loadings & Thresholds
[Figure: items grouped by factor (Organization, Reading Instruction, Writing Instruction). For each item the plot shows the factor loading and two thresholds on the factor z-score scale: the threshold between medium and low quality and the threshold between high and medium quality.]

Model Results: Factor Loadings & Thresholds
[Figure: items grouped by factor (Assessment, Curriculum, Oral Language, Management, Climate). For each item the plot shows the low/medium and medium/high thresholds on the factor z-score scale.]
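One way to read a loading together with its two thresholds is to turn them into the probability of each rating at a given level of the factor. The sketch below assumes an ordered-probit measurement model with a standardized loading and a unit-variance factor, so the residual variance of the latent response is 1 minus the squared loading. The slides do not state the parameterization used, so treat this as an assumption; the function names and example values are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def category_probabilities(eta, loading, thresholds):
    """Probability of each ordered rating given factor score eta.

    thresholds must be sorted ascending, for example
    (low/medium, medium/high) for a three-point rating.
    """
    resid_sd = math.sqrt(1.0 - loading ** 2)
    # Cumulative probability of scoring at or below each threshold.
    cumulative = [normal_cdf((t - loading * eta) / resid_sd) for t in thresholds]
    bounds = [0.0] + cumulative + [1.0]
    return [upper - lower for lower, upper in zip(bounds, bounds[1:])]
```

For example, `category_probabilities(0.0, 0.6, (-1.0, 1.0))` gives the chances of a low, medium, and high rating for an average classroom on an item with loading 0.6. Raising `eta` shifts probability toward the high category, and a larger loading makes that shift steeper.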

Factor Correlation Matrix
[Table: correlations among the eight factors: Assessment, Climate, Curriculum, Management, Oral Language, Organization, Reading, and Writing. Only the first diagonal entry (Assessment, 1.00) survived in this transcript.]

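Interobserver Agreement on Factor Scores
[Table: one row per factor (Assessment, Climate, Management, Oral Language, Organization, Reading, Writing). Column groups: interobserver correlation (r); mean differences (mean and SD for each observer in the comparison, with p for the difference); cross-classified ICC (teacher, observer). Cell values did not survive in this transcript.]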
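The three kinds of agreement statistic in this table answer different questions, which a small sketch can show. The code below computes a Pearson correlation, a paired mean difference, and a one-way intraclass correlation for two observers scoring the same classrooms. The one-way ICC is a deliberately simplified stand-in for the cross-classified ICC reported in the slides, and the function name and data are hypothetical.

```python
import math


def agreement(obs_a, obs_b):
    """Pearson r, mean difference (a - b), and one-way ICC for paired scores."""
    n = len(obs_a)
    mean_a, mean_b = sum(obs_a) / n, sum(obs_b) / n

    # Pearson correlation: do the observers rank classrooms the same way?
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(obs_a, obs_b))
    var_a = sum((a - mean_a) ** 2 for a in obs_a)
    var_b = sum((b - mean_b) ** 2 for b in obs_b)
    r = cov / math.sqrt(var_a * var_b)

    # One-way ICC with k = 2 ratings per classroom: absolute agreement.
    k = 2
    grand = (mean_a + mean_b) / 2
    pair_means = [(a + b) / 2 for a, b in zip(obs_a, obs_b)]
    ms_between = k * sum((m - grand) ** 2 for m in pair_means) / (n - 1)
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for a, b, m in zip(obs_a, obs_b, pair_means)
    ) / (n * (k - 1))
    icc = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

    return {"r": r, "mean_diff": mean_a - mean_b, "icc": icc}
```

If one observer is uniformly one point more severe than the other, as in `agreement([1, 2, 3, 4], [2, 3, 4, 5])`, the correlation stays perfect while the mean difference is nonzero and the ICC falls below 1. That is why a correlation alone cannot detect differences in rater severity.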

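Correlations to campus mean achievement
[Table: fall and spring correlations of Reading Instruction, Oral Language Instruction, and Classroom Management with campus mean achievement, by test (ITBS, TAKS), grade, and year, with the number of schools in each row. Correlation values did not survive in this transcript; only the significance markers remain.]
* p < .05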

Conclusions
Confirmatory factor models can be applied to observational data to examine relations among items, constructs, raters, time, and sites.
CFA on categorical items reveals functional information about the items.
CFA incorporates theory and design in a falsifiable way.
CFA is complex, but can serve as a first-stage validity analysis to be exported into other analyses, such as G-theory or multilevel models of outcomes.

Questions, Comments?

References
Raudenbush, S. W. (2008). Statistical inference when classroom causality is measured with error. Paper presented at the SREE.
Raudenbush, S. W., Rowan, B., & Kang, S. J. (1991). A multilevel, multivariate model for school climate with estimation via the EM algorithm and application to US high school data. J Ed Stat, 16.
Raudenbush, S. W., & Sampson, R. J. (1999). Ecometrics: Toward a science of assessing ecological settings, with application to the systematic social observation of neighborhoods. Soc Methodology, 29(1).
Smith, M. W., Dickinson, D. K., Sangeorge, A., & Anastasopoulos, L. (2002). Early Language & Literacy Classroom Observation (ELLCO) Toolkit, Research Edition. Baltimore, MD: Paul H. Brookes.