Capacity and Fidelity Assessments: Advancements in Tools for Schools

Presentation transcript:

Capacity and Fidelity Assessments: Advancements in Tools for Schools
Christine Russell, Ed.D., Evaluation and Research Specialist for MiBLSi (crussell@miblsimtss.org)
Caryn Sabourin Ward, Ph.D., Senior Implementation Specialist/Scientist (caryn.ward@unc.edu)

Learning Objective
Understand innovative practices applied to the validation of:
- Capacity assessments: Regional Capacity Assessment (RCA) and District Capacity Assessment (DCA)
- Fidelity assessment: Reading Tiered Fidelity Inventory (R-TFI)

Regional Capacity Assessment (RCA)
- Used by Regional Education Agencies at least 2 times a year
- Assessment of a regional education agency's systems, activities, and resources necessary for the regional education agency to successfully support district-level implementation of Effective Innovations

District Capacity Assessment (DCA)
- Used by District Implementation Teams at least 2 times a year
- Assessment of the district's systems, activities, and resources necessary for schools to successfully adopt and sustain Effective Innovations

Reading Tiered Fidelity Inventory (R-TFI), Elementary and Secondary Versions
- Fidelity assessment used by School Implementation Teams at least annually
- Assesses the implementation of a School-Wide Reading Model encompassing (1) evidence-based practices focused on the Big Ideas of Reading, (2) systems to address the continuum of reading needs across the student body, and (3) data use and analysis

Regional Capacity Assessment (RCA) District Capacity Assessment (DCA)

Reading Tiered Fidelity Inventory (R-TFI) Elementary Version

Administration Process
- Self-assessment completed by a team with a trained external administrator and local facilitator
- 1-2 hours in length
- Consensus scoring

Example Item and Rubric

Consider
- How have you experienced enhanced action planning through the use of any of the following NIRN capacity measures: the State Capacity Assessment (SCA), Regional Capacity Assessment (RCA), District Capacity Assessment (DCA), or Drivers Best Practices Assessment?
- How have you experienced enhanced action planning through the use of fidelity assessments?

Focus on Content Validation
Validation provides:
- Confidence that the assessment correlates with positive outcomes
- Quality with which the underlying construct is measured
- Accuracy and meaning of the assessment results
Without validation, a self-assessment is more of a checklist or support tool than a measure we can be confident in.

Validation Process
- Classic model: content, criterion, and construct validity
- Modern approach: validity evidence based on test content, response process, internal structure, relationship to other variables, and consequence of testing

Sources of Validity and Example Methodologies

Test Content
- Description: Instrument characteristics such as themes, wording, format of items, tasks, questions, instructions, guidelines, and procedures for administration and scoring
- Example methodologies: Basis for items/literature review; qualifications of authors and reviewers; item writing process; review by a panel of experts; vetting and editing process

Response Process
- Description: Fit between the items and the process engaged in by those using the assessment
- Example methodologies: Think Aloud protocols

Internal Structure
- Description: Considers the relationships among items and test components compared to test constructs
- Example methodologies: Factor and Rasch analysis

Relationship to Other Variables
- Description: Relationship of test scores to variables external to the test; relationship between a test score and an outcome
- Example methodologies: Predictive, concurrent, convergent, and divergent evidence

Consequence of Testing
- Description: Intended and unintended consequences of test use
- Example methodologies: Examination of the purpose, use, and outcomes of test administration, including arguments for and against

Rationale for focusing on test content: Test content represents the extent to which the items adequately sample the construct (Gable and Wolf, 1994). Gathering evidence of test content establishes the appropriateness of the conceptual framework and how well the items represent the construct (Sireci & Faulkner-Bond, 2014).

Phase 1: Construct Definition, Item Generation
Phase 2: Test Content Validation - Survey Protocol
Phase 3: Response Process Validity - Think Aloud Protocol
Phase 4: Usability and Refinement

Construct Definition, Item Construction
Experts utilized:
- Previous iterations of similar assessments
- Feedback from administrators and practitioners who had experience with similar assessments
- RCA/DCA: advancements within implementation science and systems change
- R-TFI: advancements within the field of school-wide reading practices

Content Validity Survey Elements (suggested by Haynes et al., 1995)
- Array of items selected (questions, codes, measures)
- Precision of wording or definition of individual items
- Item response format (e.g., scale)
- Sequence of items or stimuli
- Instructions to participants
- Temporal parameters of responses (interval of interest; timed vs. untimed)
- Situations sampled
- Behavior or events sampled
- Components of an aggregate, factor, response class
- Method and standardization of administration
- Scoring, data reduction, item weighting
- Definition of domain and construct
- Method-mode match
- Function-instrument match

4-Part Content Validation Survey Protocol

Section #1: Consent and Edits
- Consent form and opt in/out of being listed as a contributor
- Downloadable Word version of the assessment
- Upload the assessment with edits, suggestions, and questions provided through track changes

Section #2: Item Analysis
- Attainability and importance of each item rated on a 3-point scale
- Opportunity to select the 5 most critical items

Section #3: Construct
- Comprehensiveness and clarity of each construct definition rated on a 3-point scale
- Open-ended comments on construct definitions
- Best fit for each item with an Implementation Driver or area of a School-Wide Reading Model

Section #4: Sequencing, Frequency, and Format
- Suggestions for reordering items
- Suggestions for frequency of administration
- Comprehensiveness and clarity of each section rated on a 3-point scale
- Open-ended comments on sections of the assessment
- Participants with previous experience administering a similar assessment were asked whether the current version is an improvement over previous version(s) and what benefits they have experienced using similar assessments in the past
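To make the overall structure of the protocol easier to see at a glance, here is a minimal sketch that represents the four sections as a plain Python dictionary. The section names follow the slide; the keys and values are illustrative assumptions, not the authors' actual survey instrument.

```python
# A minimal sketch of the 4-part survey protocol above as a plain Python
# dictionary. Section names follow the slide; the keys and values are
# illustrative assumptions, not the authors' actual survey instrument.

CONTENT_VALIDATION_SURVEY = {
    "section_1_consent_and_edits": {
        "consent_form": True,
        "contributor_listing_opt_in": True,
        "deliverable": "assessment uploaded with tracked-changes edits, suggestions, questions",
    },
    "section_2_item_analysis": {
        "ratings_per_item": ("attainability", "importance"),
        "rating_scale": (1, 2, 3),  # 3-point scale
        "select_most_critical_items": 5,
    },
    "section_3_construct": {
        "ratings_per_construct_definition": ("comprehensiveness", "clarity"),
        "rating_scale": (1, 2, 3),
        "open_ended_comments": True,
        "map_each_item_to": "Implementation Driver or School-Wide Reading Model area",
    },
    "section_4_sequencing_frequency_format": {
        "reordering_suggestions": True,
        "administration_frequency_suggestion": True,
        "ratings_per_section": ("comprehensiveness", "clarity"),
        "rating_scale": (1, 2, 3),
        "prior_experience_follow_up": "improvement over previous versions; benefits experienced",
    },
}

# Example: list what respondents rate in each section
for section, spec in CONTENT_VALIDATION_SURVEY.items():
    rated = [k for k in spec if k.startswith("ratings_per")]
    print(section, "->", [spec[k] for k in rated])
```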

Survey Participants
- RCA: 23 total; 4 researchers/national technical assistance providers; 15 state/regional technical assistance providers
- DCA: 34 total; 19 researchers/national technical assistance providers; 11 state/regional technical assistance providers
- R-TFI Elementary: 10 total; 6 researchers/national technical assistance providers; no state/regional technical assistance providers
The remaining participants in each group were district practitioners. The number of participants suggested for a content validation survey varies from 2 to 20 (Gable & Wolf, 1993; Grant & Davis, 1997; Lynn, 1986; Tilden, Nelson, & May, 1990; Waltz, Strickland, & Lenz, 1991).

Minutes Spent Completing the Survey (average, with range in parentheses)
- RCA: Survey #1 Consent and Edits 162 (70-300); Survey #2 Item Analysis 23 (10-50); Survey #3 Construct 27 (6-50); Survey #4 Sequencing, Frequency and Format 18 (5-45); Total 230 (140-390)
- DCA: Survey #1 89 (23-200); Survey #2 26 (10-60); Survey #3 (5-75); Survey #4 20 (6-60); Total 157 (40-275)
- R-TFI Elementary: Survey #1 99 (5-180); Survey #2 12 (5-20); Survey #3 19 (10-35); Survey #4 9 (5-25); Total 135 (30-215)

Content Validation
- Did we improve the assessment compared to comparable or previous assessments?
- Are the definitions of the constructs clear and usable?
- How frequently should the assessment be administered?
- Are the sections of the assessment comprehensive and clear?

Item Analysis
- Does the item fit the content domain?
- How relevant/important is the item for the domain?
- What edits are needed to the item and rubric?

DCA Content Validation Results – Improvements Compared to Other Measures
- 76.5% (n=16) of respondents had previously completed a similar assessment
- Level of improvement = 8 (on a scale of 0-10)
- Respondents described the DCA as streamlined, with shorter, more concise items, and improved due to the use of a rubric

DCA Content Validation Results – Construct Definitions
Decision cut and decision rule: If the average rating for comprehensiveness or clarity is less than 2.5, revise the definition based on comments.
Results:
- Capacity and Competency: met the thresholds for comprehensiveness and clarity; no revisions
- Leadership and Organization: did not meet the clarity threshold; definitions rewritten for increased clarity

DCA Content Validation Results – Frequency of Assessment
Decision cut and decision rule: If more than 70% of respondents suggest one option for frequency, use that recommendation as the suggested frequency.
Results: The criterion was not met, so the majority response was used; the suggested frequency is to assess twice annually. Comments noted that at later stages of implementation, less frequent assessment may be appropriate.

DCA Content Validation Results – Comprehensive and Clear Sections
Decision cut and decision rule: If the average rating for comprehensiveness or clarity is less than 2.5, revise the section based on comments.
Results: All sections met the threshold for both comprehensiveness and clarity. Sections were still revised based on feedback within the track-changes documents to increase ease of use and to make needed edits.

DCA Content Validation Results – Item Analysis
Decision cuts and decision rules (CVI = Content Validity Index, the average of respondents' ratings):
- Importance CVI below 2.5: eliminate or substantially change the item
- Importance CVI at or above 2.5: decide whether to accept an edit or address a comment/question based on whether the suggestion enhances the clarity of the item
- Number of times an item was rated among the 5 most important: used to further validate the CVI rating
- Attainability CVI less than 1.5: develop an action plan to create resources that assist teams with action planning and attaining the item
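As a rough illustration of how these decision cuts could be applied, the sketch below computes a simple CVI (the average of respondents' 3-point ratings) for one item and checks it against the importance and attainability thresholds. The function and variable names are hypothetical; this is not the DCA authors' scoring code.

```python
# Illustrative sketch of the item-analysis decision rules described above.
# CVI here is simply the mean of respondents' 3-point ratings for an item;
# names and example ratings are hypothetical.
from statistics import mean

IMPORTANCE_CUT = 2.5    # below this: eliminate or substantially change the item
ATTAINABILITY_CUT = 1.5  # below this: plan resources to help teams attain the item

def content_validity_index(ratings):
    """CVI: the average of respondents' ratings on the 3-point scale."""
    return mean(ratings)

def item_decisions(importance_ratings, attainability_ratings, top5_votes):
    importance_cvi = content_validity_index(importance_ratings)
    attainability_cvi = content_validity_index(attainability_ratings)
    decisions = []
    if importance_cvi < IMPORTANCE_CUT:
        decisions.append("eliminate or substantially change the item")
    else:
        decisions.append("keep; accept edits only where they enhance item clarity")
    if attainability_cvi < ATTAINABILITY_CUT:
        decisions.append("develop resources to help teams attain the item")
    # The number of times the item was rated among the 5 most critical items
    # is used as corroborating evidence for the importance CVI.
    decisions.append(f"top-5 votes (corroboration): {top5_votes}")
    return importance_cvi, attainability_cvi, decisions

# Example: one item rated by five respondents on the 3-point scale
print(item_decisions([3, 2, 3, 2, 3], [2, 1, 2, 2, 1], top5_votes=2))
```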

DCA Content Validation Results – Item Analysis
- Importance: 1 item below the 2.5 threshold
- Attainability: 11 items below the 2.5 threshold
- Item revisions: combined 2 items; deleted 1 item
- Edits were made within the rubric for each item based on suggestions

DCA Content Validation Results – Item Analysis

Consider
- Reflect on the low attainability scores for items related to the Competency Driver. How does this match what you see in your work with teams and schools?
- How does this finding relate to the work you are doing to develop supports and resources that help teams sustain their work?

DCA Content Validation Results – Item Match with Constructs
Decision cuts and decision rules:
- If more than 70% of respondents align an item with a construct, the item is housed within that construct.
- If fewer than 70% of respondents align an item with one clear construct, the authors use the results, comments, and their knowledge of the constructs to map the item to a construct.
Results:
- Met the 70% criterion: 3 items
- 50%-70% aligned with one construct: 20 items
- Below 50%: 3 items
Decision: Use author knowledge along with comments to map items to constructs.
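The sketch below illustrates, under the assumption of a simple vote count per item, how the 70% alignment rule could be applied. The construct names in the example are placeholders and the function is not part of the DCA materials.

```python
# Illustrative sketch of the item-to-construct alignment rule described above.
# Respondent choices are hypothetical; the 70% cut point follows the slide.
from collections import Counter

def alignment_decision(construct_choices):
    """construct_choices: the construct each respondent matched a single item to."""
    counts = Counter(construct_choices)
    construct, votes = counts.most_common(1)[0]
    share = votes / len(construct_choices)
    if share > 0.70:
        return construct, share, "house the item within this construct"
    # Below 70% agreement (including below 50%): authors map the item using
    # the results, comments, and their knowledge of the constructs.
    return construct, share, "authors map the item using results, comments, and construct knowledge"

# Hypothetical responses for one item
choices = ["Leadership", "Leadership", "Organization", "Leadership", "Competency"]
print(alignment_decision(choices))  # 60% agreement -> author judgment
```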

DCA Content Validation Results – Sequencing of Items
Decision cut and decision rule: If more than 50% of respondents suggest moving an item, consider a revised location/order for the item.
Results: 77% of respondents made no reordering suggestions. Minor reordering occurred based on comments and on edits to the assessment.

Think Aloud Procedure
Excerpt from the Think Aloud protocol: "We are going to ask you to read portions of the document aloud. The purpose of reading aloud is to ensure clarity and ease of reading the measure. This process will allow us to capture any areas where wording needs to be adjusted. As you read, please verbalize any thoughts, reactions, or questions that are running through your mind. Please act and talk as if you are talking to yourself, and be completely natural and honest about your rating process and reactions. Also, feel free to take as long as needed to adequately verbalize."
Participants: RCA 4, DCA 4, R-TFI 3.

Changes to the R-TFI from the Think Aloud
- Reworded the introduction and directions
- Added before-, during-, and after-administration sections
- Changed the wording from "Subscales" to "Tiers"
- Changed the sequence of items in the Tier 2 section
- Included additional words in the glossary
- Found some issues with consistency of terms
- Identified items that are not applicable to school decision making and instead should be asked at the district level

Usability Testing
A type of improvement cycle based on the PDSA Cycle (Deming):
- Plan: What did you intend to do?
- Do: Did you do it?
- Study: What happened?
- Act: What can be changed and improved?

Usability Testing
- A planned series of tests to refine and improve the assessment and the administration processes
- Used proactively to test the feasibility and impact of a new way of work prior to rolling out the assessment or administration processes more broadly
- More is learned from 4 cycles with 5 participants each than from 1 pilot test with 20 participants

The idea is to use the PDSA process with small groups of 4 or 5 administrations:
- Plan: plan to use the capacity assessment
- Do: engage in training and conduct the assessments
- Study: debrief as a team and identify successes, changes, and improvements needed in the process of utilizing the tool (training, introducing it to respondents, conducting the assessments, using the results, etc.)
- Act: apply those changes to the next set of users
Repeat the PDSA process with the next set of administrations 4 or 5 times, as sketched below.
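As a loose illustration of this cycle structure, the sketch below models usability-testing cohorts as repeated PDSA cycles. The data structure, improvement counts, and `act` behavior are assumptions for illustration only; the cohort sizes simply mirror the RCA example later in this deck.

```python
# Illustrative sketch of usability testing as repeated PDSA cycles.
# The class, fields, and example numbers are hypothetical.
from dataclasses import dataclass, field

@dataclass
class PDSACycle:
    cohort: int
    administrations: int                              # Plan: a small number of test administrations
    improvements: dict = field(default_factory=dict)  # Study: changes identified, by area

    def act(self):
        """Act: carry the identified changes into the next cohort's process."""
        return {area: "apply before next cohort" for area in self.improvements}

# Hypothetical improvement counts per area across four small cohorts
cycles = [
    PDSACycle(1, 4, {"Communication & Preparation": 3, "Items & Scoring Rubric": 2}),
    PDSACycle(2, 5, {"Administration Protocol": 1}),
    PDSACycle(3, 6, {}),  # fewer changes needed as the process stabilizes
    PDSACycle(4, 6, {}),
]
for c in cycles:
    print(f"Cohort {c.cohort}: {c.administrations} administrations, "
          f"{sum(c.improvements.values())} improvements identified -> {c.act()}")
```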

Usability Testing
Assessed 5 areas, each with respective goals:
- Communication & Preparation
- Administration Protocol
- Items & Scoring Rubric
- Participant Response
- Training Implications
Data were collected via survey from administrators and facilitators for each administration in each cohort.

Usability Testing: RCA
- Cohort 1: N = 4 administrations across 2 states
- Cohort 2: N = 5 administrations across 3 states
- Cohort 3: N = 6 administrations across 3 states
- Cohort 4: N = 6 administrations across 4 states

Results of Usability Testing for the RCA
The number of improvements in each of the five areas decreased over the cycles, and all goals were met.
- Communication & Preparation: more guidance developed around team composition and respondents
- Administration Protocol: 100% on the fidelity protocol and ratings of importance (4 or higher)
- Items & Scoring Rubric: minor wording changes to items; sequencing of items was reviewed but not changed
- Training Implications: facilitation skills identified; prioritization of areas for action planning
- Participant Response: engaged and positive

Consider
- What barriers and facilitators have you encountered when engaging in usability testing?
- How has usability testing helped to refine your measurement development work?
- How has usability testing helped to identify areas of strength and gap areas in your assessments?

Improvements to the Validity Process
- Use of track changes within the actual assessment tool as a method for providing feedback
- Clear decision rules for item revisions
- Lengthy survey broken down into manageable segments
- Use of the Response Process/Think Aloud protocol to further refine the assessment
- Usability testing used to refine and improve the measurement tool and assessment processes

Lessons Learned

Content Validation Survey
- Organize feedback into "Quick Edits", "Questions", and "Comments"
- Track and report positive comments

Think Aloud
- Have participants begin at different sections so fatigue doesn't impact the quality of feedback on later questions/sections of the tool

PDSA Cycles – Usability Testing requires:
- Discipline to have a plan and stick to it
- Studying and acting (plan-do, plan-do, plan-do = my colleague saying ...)
- Repetition until the goal is reached or the problem is solved (my other colleague says: often thwarted, never stymied)

Importance of Content Validation and PDSA in Assessment Development
- Content Validation Survey: a critical first step in validating a measure
- Think Aloud: creates a well-edited product prior to publishing
- PDSA Cycles (Usability Testing): start small and get better before extensive roll-out; a very efficient way to develop, test, and refine the measure and its use in practice

More Information
- MiBLSi: http://miblsi.cenmi.org/
- Evaluation and Measurement page: http://miblsi.cenmi.org/MiBLSiModel/Evaluation/Measures.aspx
- DCA Technical Manual
- Active Implementation Hub (open-access learning): http://implementation.fpg.unc.edu

Citation and Copyright
This document is based on the work of the National Implementation Research Network (NIRN). © 2013-2016 Allison Metz, Leah Bartley, Jonathan Green, Laura Louison, Sandy Naoom, Barbara Sims, and Caryn Ward.
This content is licensed under the Creative Commons license CC BY-NC-ND (Attribution-NonCommercial-NoDerivs). You are free to share, copy, distribute, and transmit the work under the following conditions: Attribution — you must attribute the work in the manner specified by the author or licensor (but not in any way that suggests they endorse you or your use of the work); Noncommercial — you may not use this work for commercial purposes; No Derivative Works — you may not alter, transform, or build upon this work. Any of the above conditions can be waived if you get permission from the copyright holder.
Email: nirn@unc.edu | Web: http://nirn.fpg.unc.edu
The mission of the National Implementation Research Network (NIRN) is to contribute to the best practices and science of implementation, organization change, and system reinvention to improve outcomes across the spectrum of human services.