Developing an evaluation of professional development, Webinar #2: Going deeper into planning the design

Presentation transcript:

Developing an evaluation of professional development, Webinar #2: Going deeper into planning the design

Information and materials mentioned or shown during this presentation are provided as resources and examples for the viewer's convenience. Their inclusion is not intended as an endorsement by the Regional Educational Laboratory Southeast or its funding source, the Institute of Education Sciences (Contract ED-IES-12-C-0011). In addition, the instructional practices and assessments discussed or shown in these presentations are not intended to mandate, direct, or control a State's, local educational agency's, or school's specific instructional content, academic achievement system and assessments, curriculum, or program of instruction. State and local programs may use any instructional content, achievement system and assessments, curriculum, or program of instruction they wish.

Purpose & Audience

In scope:
– Evaluation designs that allow for causal inferences (RCTs and QEDs)
– Creating an evaluation plan to examine the effectiveness of professional development

Out of scope:
– Other program evaluation designs
– Identifying best practices for conducting professional development
– Identifying best practices in systems change

Target audience: LEAs, SEAs, and researchers who are interested in creating an evaluation of a specific professional development program and who have an intermediate understanding of effectiveness studies.

PLANNING THE DESIGN
Dr. Sharon Koon

Distinction between WWC evidence standards and additional qualities of strong studies

WWC design considerations for assessing effectiveness research:
– Two distinct groups: a treatment group (T) and a comparison group (C).
– For randomized controlled trials (RCTs), low attrition for both the T and C groups.
– For quasi-experimental designs (QEDs), baseline equivalence between the T and C groups.
– The contrast between the T and C groups measures the impact of the treatment.
– Valid and reliable outcome data used to measure the impact of the treatment.
– No known confounding factors.
– Outcome(s) not overaligned with the treatment.
– The same data collection process (same instruments, same time/year) for the T and C groups.

Source: Experiments-in-Education-Version-2.pdf

Distinction between WWC evidence standards and additional qualities of strong studies (cont.)

Additional qualities of strong studies:
– Pre-specified and clear primary and secondary research questions.
– Generalizability of the study results.
– Clear criteria for research sample eligibility and matching methods.
– A sample size large enough to detect meaningful and statistically significant differences between the T and C groups, overall and for specific subgroups of interest.
– Analysis methods that reflect the research questions, design, and sample selection procedures.
– A clear plan to document the implementation experiences of the T and C conditions.

Source: Experiments-in-Education-Version-2.pdf

Determinants of a What Works Clearinghouse (WWC) study rating

Study features that will be discussed

Randomized controlled trials (RCTs)
– Random assignment process, including cluster-level RCT considerations
– Attrition, both overall and T-C differential

Quasi-experimental designs (QEDs) and high-attrition RCTs
– Baseline equivalence

For both RCTs and QEDs
– Confounding factors
– Outcome eligibility

Power analysis (not considered by WWC evidence standards)

Source: WWC references

Random assignment process

Units can be assigned at any level and at multiple levels (e.g., schools, teachers, students).
– Cluster design: groups rather than individuals are the unit of assignment.

Make sure the units are:
– Assigned entirely by chance
– Given a non-zero probability of being assigned to each group (probabilities can differ across conditions)
– Given a consistent assignment probability within each group, or analyzed with an approach that accounts for differing probabilities

Random assignment can be useful to conduct within strata (see the sketch below).

Units must keep their assigned status in the analysis, even if noncompliance occurs (i.e., intent-to-treat analysis).
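To make the within-strata assignment concrete, here is a minimal Python sketch (not from the webinar) that assigns hypothetical school IDs to T or C within districts, entirely by chance and with a fixed seed so the assignment can be reproduced and audited. All names and values are illustrative.

```python
import random

def assign_within_strata(units, strata, p_treatment=0.5, seed=20160115):
    """Randomly assign each unit to treatment ("T") or comparison ("C") within its stratum.

    units: list of unit identifiers (e.g., school IDs)
    strata: dict mapping each unit to its stratum (e.g., district)
    p_treatment: target share assigned to T (can be set differently for each study)
    """
    rng = random.Random(seed)  # fixed seed documents the assignment so it can be audited
    by_stratum = {}
    for unit in units:
        by_stratum.setdefault(strata[unit], []).append(unit)

    assignment = {}
    for members in by_stratum.values():
        rng.shuffle(members)                       # order determined entirely by chance
        n_treat = round(len(members) * p_treatment)
        for i, unit in enumerate(members):
            assignment[unit] = "T" if i < n_treat else "C"
    return assignment

# Illustrative cluster design: six schools blocked by district
schools = ["S1", "S2", "S3", "S4", "S5", "S6"]
district = {"S1": "A", "S2": "A", "S3": "A", "S4": "B", "S5": "B", "S6": "B"}
print(assign_within_strata(schools, district))
```

Whatever tool is used, the point from the slide still holds: units keep their assigned status in the analysis even if they do not comply (intent-to-treat).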

Cluster-level RCT considerations

When cluster-level outcomes are analyzed, results provide evidence about cluster-level effects.

To meet WWC standards without reservations for analyses of subcluster effects, the sample should include subcluster units identified before the results of the random assignment were revealed. For example, in a school-level RCT examining teacher retention, the sample:
– Should include teachers who were in the schools before the random assignment results were provided to the schools
– Cannot meet standards without reservations if it includes any teachers who joined the schools after the random assignment results were provided

Attrition

Attrition occurs when sample members initially assigned to the T or C group are not in the analysis because they are missing key data used to calculate impacts.
– Key data include outcomes and, for high-attrition RCTs, the characteristics used to assess baseline equivalence.

The WWC is concerned about overall attrition and differences in attrition rates between the T and C groups.

The WWC examines cluster and, if applicable, subcluster attrition.
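As a rough illustration of how overall and differential attrition are computed (the WWC compares these two rates against its published attrition boundaries), here is a small Python sketch; the teacher counts are hypothetical.

```python
def attrition_rates(n_assigned_t, n_analyzed_t, n_assigned_c, n_analyzed_c):
    """Overall and differential attrition for a two-group RCT.

    Attrition = share of randomly assigned units missing from the analytic sample.
    """
    attr_t = 1 - n_analyzed_t / n_assigned_t
    attr_c = 1 - n_analyzed_c / n_assigned_c
    overall = 1 - (n_analyzed_t + n_analyzed_c) / (n_assigned_t + n_assigned_c)
    differential = abs(attr_t - attr_c)
    return overall, differential

# Hypothetical example: 40 teachers assigned per group; 36 T and 30 C remain in the analysis
overall, differential = attrition_rates(40, 36, 40, 30)
print(f"overall attrition = {overall:.1%}, differential attrition = {differential:.1%}")
# -> overall attrition = 17.5%, differential attrition = 15.0%
```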

Ways of minimizing attrition in RCTs

Make sure study participation activities are clear to everyone involved.
– E.g., this can prevent an uninformed superintendent from pulling the plug on the study.

Conduct random assignment after participants have consented to study participation.
– If assignment comes first, non-consent counts as attrition.

Conduct random assignment as close to the start of the implementation period as possible.
– This can help minimize attrition due to turnover.

QEDs

In a QED, there are at least two groups (one intervention and one comparison), and the groups are created non-randomly. For example:
– Use a convenience sample: nonparticipants who are nearby and available but are not participating in the intervention.
– Use a statistical technique to match participants (e.g., propensity score matching).
– Form the groups retrospectively, using administrative data.
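For the matching approach mentioned above, the sketch below shows one common pattern, 1:1 nearest-neighbor matching on an estimated propensity score, using scikit-learn. The covariate names and data are invented for illustration; a real study would also check covariate balance after matching and decide whether to match with or without replacement.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def match_on_propensity(df, treat_col, covariates):
    """1:1 nearest-neighbor matching on the estimated propensity score (with replacement, for brevity)."""
    X = df[covariates].to_numpy()
    z = df[treat_col].to_numpy()
    pscore = LogisticRegression(max_iter=1000).fit(X, z).predict_proba(X)[:, 1]
    df = df.assign(pscore=pscore)

    treated = df[df[treat_col] == 1]
    comparison = df[df[treat_col] == 0]
    nn = NearestNeighbors(n_neighbors=1).fit(comparison[["pscore"]])
    _, idx = nn.kneighbors(treated[["pscore"]])   # closest comparison unit for each treated unit
    return treated, comparison.iloc[idx.ravel()]

# Invented example: teachers matched on a pretest score and school poverty rate
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "pretest": rng.normal(50, 10, 200),
    "school_frl_pct": rng.uniform(0, 1, 200),
    "treated": rng.integers(0, 2, 200),
})
treated, matched_comparison = match_on_propensity(data, "treated", ["pretest", "school_frl_pct"])
```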

Baseline equivalence

Baseline equivalence must be demonstrated for QEDs and high-attrition RCTs.

It is assessed for the units/individuals in the analytic sample, using baseline characteristics such as:
– A prior measure of the outcome
– Demographic characteristics related to the outcome of interest

Baseline equivalence (cont.)

Calculate the T-C standardized mean difference at baseline:
– Differences between 0.05 and 0.25 standard deviations require statistical adjustment when calculating impacts.
– If the difference is greater than 0.25 standard deviations for any required characteristic, then no outcomes in that domain may meet standards.
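A minimal sketch of the calculation behind those thresholds, assuming a continuous baseline measure and a pooled-standard-deviation effect size (for formal review the WWC uses Hedges' g with a small-sample correction for continuous measures and a different index for dichotomous ones, so treat this as illustrative):

```python
import numpy as np

def standardized_difference(treat, comp):
    """T-C baseline difference in pooled-standard-deviation units (uncorrected Hedges' g)."""
    treat, comp = np.asarray(treat, float), np.asarray(comp, float)
    n_t, n_c = len(treat), len(comp)
    pooled_var = ((n_t - 1) * treat.var(ddof=1) + (n_c - 1) * comp.var(ddof=1)) / (n_t + n_c - 2)
    return (treat.mean() - comp.mean()) / np.sqrt(pooled_var)

def equivalence_status(diff):
    """Apply the 0.05 / 0.25 decision rule described above to an absolute baseline difference."""
    diff = abs(diff)
    if diff <= 0.05:
        return "satisfies baseline equivalence"
    if diff <= 0.25:
        return "satisfies equivalence only with statistical adjustment in the impact model"
    return "does not satisfy baseline equivalence"

# Hypothetical pretest scores for the analytic sample
rng = np.random.default_rng(1)
diff = standardized_difference(rng.normal(51, 10, 120), rng.normal(50, 10, 110))
print(round(diff, 3), "->", equivalence_status(diff))
```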

Confounding factors

Common confounds:
– A single unit (school, classroom, or teacher) in one or both conditions
– Characteristics of the units in each group differ systematically in ways that are associated with the outcomes
– The intervention is bundled with other services that are not being studied
– The T and C conditions occur at different points in time

Outcome eligibility

Outcomes must demonstrate face validity and reliability. Minimum reliability standards include:
– Internal consistency (such as Cronbach's alpha) of 0.50 or higher;
– Temporal stability/test-retest reliability of 0.40 or higher; or
– Inter-rater reliability (such as percentage agreement, correlation, or kappa) of 0.50 or higher.

Outcomes must not be overaligned with the treatment.
– E.g., an outcome measure based on an assessment that relies on materials used in the T condition but not in the C condition (e.g., specific reading passages)
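As a quick check against the internal-consistency threshold above, Cronbach's alpha can be computed directly from an item-level score matrix. The sketch below uses simulated survey data; the variable names and values are illustrative.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for an (n_respondents x n_items) matrix of item scores."""
    item_scores = np.asarray(item_scores, float)
    k = item_scores.shape[1]
    sum_item_var = item_scores.var(axis=0, ddof=1).sum()
    total_var = item_scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

# Simulated 5-item measure for 50 respondents, where items share a common underlying trait
rng = np.random.default_rng(2)
trait = rng.normal(size=(50, 1))
items = trait + rng.normal(scale=1.0, size=(50, 5))
print(f"alpha = {cronbach_alpha(items):.2f}  (WWC minimum for internal consistency: 0.50)")
```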

Outcome eligibility (cont.)

Outcomes must be collected in the same manner for both the T and C groups. Issues include:
– Different modes, timing, or personnel used for the two groups
– Measures constructed differently for the two groups

Power analysis

Power: the probability of finding a difference when there is a true difference in the populations (i.e., correctly rejecting a false null hypothesis).

Key variables that influence the power of a statistical test:
– The alpha level the researcher chooses
– The magnitude of the true population effect (effect size)
– The sample size
– Any clustering of the data
– The extent to which baseline covariates predict the outcome variable

Power analysis (cont.)

An a priori power analysis is conducted before the study begins. It enables you to design a study with adequate statistical power.

Several online tools are available to researchers. For example, Optimal Design can be used for individual and group (cluster) RCTs.
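For a simple two-group, individually randomized design, the same kind of calculation that Optimal Design performs graphically can be sketched with statsmodels. The effect size, sample sizes, cluster size, and intraclass correlation below are assumptions for illustration only; a true cluster RCT should be powered with software that models the clustering directly.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power achieved for an assumed effect size of 0.25 SD with 150 participants per group
power = analysis.power(effect_size=0.25, nobs1=150, alpha=0.05, ratio=1.0)
print(f"power = {power:.2f}")

# Per-group sample size needed to reach 80% power for the same effect size
n_per_group = analysis.solve_power(effect_size=0.25, power=0.80, alpha=0.05, ratio=1.0)
print(f"n per group = {n_per_group:.0f}")

# Rough adjustment for a cluster design: inflate by the design effect 1 + (m - 1) * ICC,
# where m is the average cluster size and ICC is the intraclass correlation (both assumed here)
m, icc = 20, 0.15
print(f"approximate n per group with clustering = {n_per_group * (1 + (m - 1) * icc):.0f}")
```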

Questions & Answers

Homework:
– Find the psychometric properties of the outcome measures you are considering
– Bring questions to the upcoming sessions

Developing an evaluation of professional development: upcoming webinars
– Webinar 3: Going Deeper into Identifying & Measuring Target Outcomes (1/15/2016, 2:00 pm)
– Webinar 4: Going Deeper into Analyzing Results (1/19/2016, 2:00 pm)
– Webinar 5: Going Deeper into Interpreting Results & Presenting Findings (1/21/2016, 2:00 pm)