Generalizability and Dependability of Direct Behavior Ratings (DBRs) to Assess Social Behavior of Preschoolers

Sandra M. Chafouleas 1, Theodore J. Christ 2, T. Chris Riley-Tillman 3, Amy M. Briesch 1, & Julie A.M. Chanese 1
1 University of Connecticut, 2 University of Minnesota, 3 East Carolina University

Introduction

Although high-quality behavior assessment tools exist for many purposes of assessment (e.g., screening, diagnosis, outcome evaluation), the same cannot be said of feasible tools for the formative assessment of social behavior. Formative assessment becomes highly relevant when the goal is to have ongoing data on behavior so that an intervention strategy can be modified quickly as appropriate. Thus, there is a need to establish reliable, valid, and feasible tools that can be customized to estimate a variety of social behaviors over time.

One potentially feasible tool for use in the formative assessment of social behavior is the Direct Behavior Rating (DBR). DBRs are hybrid assessment tools in that they combine characteristics of systematic direct observation and behavior rating scales. That is, when using a DBR, the rating process is similar to that of a behavior rating scale (e.g., "On a scale of 1-6, how well did Johnny pay attention?"), yet, as in systematic direct observation, the rating occurs following a specified, shorter period of time (Chafouleas, Riley-Tillman, & Sugai, in press). Empirical support for the reliability of DBR use is limited; as such, systematic empirical investigation of the psychometric properties of the DBR is needed. Thus, the purpose of this study was to provide preliminary psychometric data regarding the generalizability and dependability of the DBR for assessing the social behavior of preschoolers through investigation of the following questions:

1.
What percentage of the variance in DBR ratings of social behavior in preschool students is accounted for by raters, time, and setting?
2. Is the DBR a reliable and valid method for assessing the social behavior of preschool students?

Method

Participants included four female, Caucasian teachers working in the preschool classroom at a center affiliated with a large university located in the Northeast. In addition to the teachers, who served as the observers, the children attending the preschool also served as participants, as their behavior was observed and recorded. The 15 students ranged in age from 3 years, 9 months to 4 years, 9 months, with an average age of 4 years, 4 months. Thirteen of the children were Caucasian and 2 were Hispanic.

The DBR created for use in this study included two social behaviors selected from the preschool curricular goals and benchmarks provided by the associated state guidelines (Connecticut State Department of Education): Works to Resolve Conflicts (WRC) and Interacts Cooperatively (IC) with Peers. When rating a student on each behavior, teachers were asked to place an X on a continuous line (115 mm in length) indicating the proportion of time that exemplary behavior was observed during the observation period.

Data were collected daily over 13 consecutive school days in late spring. Two 30-minute observations were conducted each day, with all four teachers observing and rating all student participants during the same time period (i.e., a fully crossed design). In all, 2,576 data points were collected.

Following data collection, three major analyses were conducted. First, generalizability (G) theory was used to analyze the variance components of the full model. Next, a second set of G-studies was conducted to examine the DBR ratings within rater.
Finally, dependability (D) studies were conducted to examine the likely magnitudes of the generalizability coefficient, ρ², and the dependability coefficient, Φ, along with the magnitudes of the relative SEM, δ, and the absolute SEM, Δ.

For additional information, please direct all correspondence to Sandra Chafouleas at:

Citation: Chafouleas, S.M., Christ, T.J., Riley-Tillman, T.C., Briesch, A.M., & Chanese, J.A.M. (2007, March). Generalizability and dependability of Direct Behavior Ratings (DBRs) to assess social behavior of preschoolers. Poster presented at the National Association of School Psychologists Annual Convention, New York, NY.

Results

Results of the G-studies suggested that:

1. Although the most substantial proportion of variance for both WRC and IC was attributed to person (18%, 38%), a fairly substantial proportion of measurement variance was attributable to the different raters (σ[raters] = 41% & 20%). That is, individual raters (i.e., teachers) tended to yield divergent ratings when the same person (i.e., student) was observed during the same interval. These inconsistencies in judgments of students' WRC and IC should discourage the generalization of DBR ratings across raters. However, when the rating profiles of the four teachers are visually compared, patterns among raters become apparent: a high degree of consistency was noted within and across students in the obtained profiles. Therefore, DBRs are not currently recommended for use in assessing behavior in relation to an absolute criterion; however, they do appear to have the potential to assist in intra-individual assessment.

2. When results were analyzed within raters, the proportion of variance attributed to person ranged from 30% to 63%, an increase of 12% to 40% over the full-model analysis. That is, when DBR ratings were analyzed within rater, the data were more indicative of the target student.

3.
The percentage of variance accounted for by day and setting was somewhat surprising: both accounted for 0% of the variance, suggesting that these particular DBR ratings were not sensitive to small fluctuations in behavior across time or setting. This has important implications for the implementation of DBRs, in that behaviors that are more static might permit less frequent rating (e.g., weekly) while providing equally useful information. Future research is needed to discern which behaviors are and are not likely to be more variable in similar and different types of settings, as well as to determine the optimal frequency with which ratings should be conducted.

Results of the D-studies suggested that:

1. DBRs are likely to approximate or exceed a reliability coefficient of .70 after seven ratings have been collected across 4 to 7 days, or .90 after 10 DBR ratings have been collected.

2. The behavior Interacts Cooperatively consistently demonstrated better dependability coefficients and smaller SEMs, and its values were less rater-dependent than those of Works to Resolve Conflicts. Future research should therefore investigate the dependability of the DBR when used to rate classroom behaviors that are both more commonly assessed (e.g., on-task behavior) and more discretely defined (e.g., raising a hand).

Summary and Conclusions

The most substantial proportions of measurement variance for both WRC and IC were attributed to person (18%, 38%) and rater (41%, 20%). Both day and setting accounted for 0% of the variance, suggesting that these particular DBR ratings were not sensitive to small fluctuations in behavior across time or setting.

Figure. Comparison of DBR Ratings Across Teachers
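The G-study and D-study logic described above — estimating variance components and then projecting how the generalizability (ρ²) and dependability (Φ) coefficients grow as more ratings are averaged — can be sketched in code. The following is a minimal illustration for a simplified persons × raters design using synthetic data with invented variance magnitudes; it is not the study's actual four-facet (person × rater × day × setting) model or its estimates:

```python
import numpy as np

def g_study(x):
    """Estimate variance components for a fully crossed persons x raters
    design (one observation per cell) via expected mean squares."""
    n_p, n_r = x.shape
    grand = x.mean()
    p_means = x.mean(axis=1)
    r_means = x.mean(axis=0)
    ms_p = n_r * np.sum((p_means - grand) ** 2) / (n_p - 1)
    ms_r = n_p * np.sum((r_means - grand) ** 2) / (n_r - 1)
    resid = x - p_means[:, None] - r_means[None, :] + grand
    ms_res = np.sum(resid ** 2) / ((n_p - 1) * (n_r - 1))
    var_res = ms_res                              # sigma^2_pr,e
    var_p = max(0.0, (ms_p - ms_res) / n_r)       # sigma^2_p (person)
    var_r = max(0.0, (ms_r - ms_res) / n_p)       # sigma^2_r (rater)
    return var_p, var_r, var_res

def d_study(var_p, var_r, var_res, n_ratings):
    """Project rho^2 (relative) and Phi (absolute) coefficients when
    a student's score is the average of n_ratings ratings."""
    rel_err = var_res / n_ratings                 # sigma^2_delta
    abs_err = (var_r + var_res) / n_ratings       # sigma^2_Delta
    rho2 = var_p / (var_p + rel_err)
    phi = var_p / (var_p + abs_err)
    return rho2, phi

# Synthetic ratings: 15 students x 4 raters (sizes match the study,
# but the variance magnitudes below are assumptions for illustration)
rng = np.random.default_rng(0)
person = rng.normal(0, 1.0, 15)          # true student differences
rater = rng.normal(0, 0.8, 4)            # rater severity/leniency
noise = rng.normal(0, 0.7, (15, 4))      # interaction + error
x = person[:, None] + rater[None, :] + noise

var_p, var_r, var_res = g_study(x)
for n in (1, 4, 7, 10):
    rho2, phi = d_study(var_p, var_r, var_res, n)
    print(f"n={n:2d}  rho^2={rho2:.2f}  Phi={phi:.2f}")
```

Because the absolute error term also carries the rater component, Φ is never larger than ρ² here, mirroring the poster's finding that rater variance limits generalization across raters while within-rater (relative) comparisons remain more dependable.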