Evaluating Impacts of MSP Grants: Common Issues and Recommendations
Ellen Bobronnikov and Hilary Rhodes
January 11, 2010

2 Overview
- Purpose of using the Criteria for Classifying Designs of MSP Evaluations ("the rubric"), created by Westat through the Data Quality Initiative (DQI)
- The rubric's key criteria for a rigorous design
- Common issues with evaluations
- Recommendations for more rigorous evaluations

3 Apply the rubric to ensure reliable results
- Projects meeting the rubric's criteria provide a more accurate determination of impact on teacher and student outcomes
- A two-step "screening" process is applied to final-year MSP evaluations to identify those with rigorous designs

4 First, assess the evaluation design
- To "qualify," final-year evaluations need to use an experimental or quasi-experimental design with a comparison group
- Of the 183 projects in their final year during PP07, 37 had qualifying evaluations with complete data
- Programs that did not qualify often used one-group-only pre-post studies, which cannot account for changes that would have occurred in the absence of the program

5 Next, use the rubric to assess implementation
- The second step is to apply the rubric to see whether the design was implemented with sufficient rigor
- The rubric comprises six criteria:
  1. Equivalence of groups at baseline
  2. Adequate sample size
  3. Use of valid and reliable measurement instruments
  4. Use of consistent data collection methods
  5. Sufficient response and retention rates
  6. Reporting of relevant statistics

6 Criterion 1 – Baseline Equivalence
- The study demonstrates no significant differences between treatment and comparison groups at baseline on variables related to the study's key outcomes (needed for quasi-experimental studies only)
- Purpose of Criterion – helps rule out alternative explanations for differences between groups

7 Criterion 1 – Baseline Equivalence
Common Issues:
- Baseline characteristics reported without a statistical test for differences
- Information critical for a complete assessment of baseline equivalence (e.g., sample size, standard deviation) is missing
Recommendations:
- Report key characteristics associated with outcomes for each group (e.g., pretest scores, teaching experience). ALWAYS report sample sizes.
- Test for group differences on key characteristics with appropriate statistical tests (e.g., chi-square for dichotomous variables, t-test for continuous variables) and report the test statistics (e.g., t-statistic, p-value), as in the sketch below.
- Control for any significant baseline differences between groups in the statistical analyses.
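A minimal sketch of the recommended baseline tests, assuming SciPy is available; the pretest scores and certification counts below are hypothetical stand-ins for real group data.

```python
# Illustrative only: baseline equivalence checks with hypothetical data.
import numpy as np
from scipy import stats

# Pretest scores (continuous) for treatment and comparison teachers
treat_pretest = np.array([72.1, 68.4, 75.0, 70.3, 66.8, 74.2])
comp_pretest = np.array([70.5, 69.9, 73.1, 67.4, 71.0, 68.2])

# t-test for a continuous baseline characteristic
t_stat, p_val = stats.ttest_ind(treat_pretest, comp_pretest)
print(f"Pretest means: {treat_pretest.mean():.1f} vs {comp_pretest.mean():.1f}, "
      f"t = {t_stat:.2f}, p = {p_val:.3f}, n = {len(treat_pretest)}/{len(comp_pretest)}")

# Chi-square test for a dichotomous characteristic (e.g., certified yes/no)
counts = np.array([[18, 12],   # treatment: certified, not certified
                   [15, 15]])  # comparison: certified, not certified
chi2, p, dof, expected = stats.chi2_contingency(counts)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```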

8 Criterion 2 – Sample Size
- Sample size is adequate, based on a power analysis using:
  – Significance level = 0.05
  – Power = 0.8
  – Minimum detectable effect informed by the literature or otherwise justified
- Alternatively, meet or exceed "rule of thumb" threshold sample sizes:
  – Teacher outcomes: 12 schools or 60 teachers
  – Student outcomes: 12 schools or 18 teachers or 130 students
- Purpose of Criterion – builds confidence in the results

9 Criterion 2 – Sample Size
Common Issues:
- Sample and subgroup sizes not reported for all teacher and student outcomes
- Sample sizes reported inconsistently across project documents
Recommendations:
- Always provide clear reporting of sample sizes for all groups and subgroups.
- Conduct a power analysis during the design stage and report the results in the evaluation (see the sketch below).
- If you do not conduct a power analysis, ensure that you meet the threshold values.
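A minimal power-analysis sketch using statsmodels; the minimum detectable effect of 0.40 standard deviations is a placeholder that would need to be justified from the literature, as the criterion requires.

```python
# Illustrative power analysis for a two-group comparison.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.40,  # hypothetical minimum detectable effect (SD units)
                                   alpha=0.05,        # significance level from the rubric
                                   power=0.80,        # power from the rubric
                                   ratio=1.0)         # equal treatment/comparison group sizes
print(f"Required sample size per group: {n_per_group:.0f}")
```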

10 Criterion 3 – Measurement Instruments
- Use existing instruments that have already been deemed valid and reliable to measure key outcomes, OR
- Create new instruments that are either:
  – Sufficiently tested with subjects comparable to the study sample and found to be valid and reliable, OR
  – Created using scales and items from pre-existing data collection instruments that have been validated and found to be reliable. The final instrument should include at least 10 items, and at least 70 percent of the items should come from the validated and reliable instruments.
- Purpose of Criterion – ensures that the instruments used accurately capture the intended outcomes

11 Criterion 3 – Measurement Instruments
Common Issues:
- Validity and reliability testing not reported for locally developed instruments
- Results of validity or reliability testing on pre-existing instruments not reported
Recommendations:
- Select instruments that have been shown to produce accurate and consistent scores in a population similar to yours.
- If creating an assessment for the project, test the new instrument's validity and reliability with a group similar to your subjects and report the results (one common reliability check is sketched below).
- When selecting items from an existing measure:
  – Describe previous work demonstrating that the source produces valid, reliable scores;
  – Provide references that describe the instrument's reliability and validity; and
  – Use full sub-scales where possible.
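The rubric does not prescribe a specific reliability statistic; as one common option, the sketch below computes Cronbach's alpha for a hypothetical respondents-by-items response matrix from a locally developed instrument.

```python
# Illustrative reliability check: Cronbach's alpha from item-level responses.
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix."""
    k = item_scores.shape[1]                          # number of items
    item_vars = item_scores.var(axis=0, ddof=1)       # variance of each item
    total_var = item_scores.sum(axis=1).var(ddof=1)   # variance of respondents' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses: rows = respondents, columns = items
responses = np.array([[4, 5, 4, 3],
                      [3, 4, 4, 3],
                      [5, 5, 4, 4],
                      [2, 3, 3, 2],
                      [4, 4, 5, 4]])
print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")
```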

12 Criterion 4 – Data Collection Methods
- The methods, procedures, and timeframes used to collect the key outcome data from treatment and comparison groups are comparable
- Purpose of Criterion – limits the possibility that observed differences can be attributed to factors besides the program, such as the passage of time or differences in testing conditions

13 Criterion 4 – Data Collection Methods
Common Issues:
- Little information provided about data collection, or the collection process described for the treatment group only
- Data collected from the groups at different times, or not collected systematically
Recommendations:
- Document and describe the data collection procedures.
- Make every effort to collect data from both the treatment and comparison groups for every outcome evaluated.
- If data cannot be collected from all members of both groups, consider randomly selecting a subset from each group (e.g., 10 participating teachers and 10 non-participating teachers), as in the sketch below.
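A small sketch of drawing the suggested random subsets; the teacher IDs and group sizes are hypothetical, and the fixed seed simply documents how the selection was made.

```python
# Illustrative random selection of a comparable subset from each group.
import random

random.seed(2010)  # record the seed so the selection is reproducible and documentable

treatment_teachers = [f"T{i:03d}" for i in range(1, 41)]   # hypothetical participating teachers
comparison_teachers = [f"C{i:03d}" for i in range(1, 41)]  # hypothetical non-participating teachers

treatment_sample = random.sample(treatment_teachers, 10)
comparison_sample = random.sample(comparison_teachers, 10)
print("Treatment subset: ", treatment_sample)
print("Comparison subset:", comparison_sample)
```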

14 Criterion 5 – Attrition
- Key outcomes need to be measured for at least 70% of the original sample (both treatment and control groups).
- If the difference in attrition rates between groups equals or exceeds 15 percentage points, it should be accounted for in the statistical analysis
- Purpose of Criterion – helps ensure that sample attrition does not bias results, as participants and control group members who drop out may systematically differ from those who remain

15 Criterion 5 – Attrition
Common Issues:
- Sample attrition rates not reported, or reported for the treatment group only
- Initial sample sizes not reported for all groups, so attrition rates could not be calculated
Recommendations:
- Report the number of units of assignment and analysis at the beginning and end of the study (the check is sketched below).
- If reporting on sub-groups, indicate their pre and post sample sizes.
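A small sketch of the retention and attrition-differential check implied by the criterion; the initial and final counts are hypothetical.

```python
# Illustrative attrition check against the rubric's thresholds.
def attrition_check(n_initial_t, n_final_t, n_initial_c, n_final_c):
    retention_t = n_final_t / n_initial_t
    retention_c = n_final_c / n_initial_c
    overall = (n_final_t + n_final_c) / (n_initial_t + n_initial_c)
    differential = abs(retention_t - retention_c)
    print(f"Treatment retention:  {retention_t:.0%}")
    print(f"Comparison retention: {retention_c:.0%}")
    print(f"Overall retention:    {overall:.0%} (criterion: at least 70%)")
    print(f"Differential:         {differential:.0%} (criterion: under 15 percentage points)")

# Hypothetical counts at assignment and at analysis
attrition_check(n_initial_t=60, n_final_t=48, n_initial_c=55, n_final_c=41)
```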

16 Criterion 6 – Relevant Statistics Reported
- Include treatment and comparison group post-test means and tests of significance for key outcomes, OR
- Provide sufficient information for calculation of statistical significance (e.g., means, sample sizes, standard deviations/standard errors)
- Purpose of Criterion – provides context for interpreting results, indicating where observed differences between groups are most likely larger than what chance alone might cause

17 Criterion 6 – Relevant Statistics Reported
Common Issues:
- Incomplete information made it difficult to assess evaluations for statistical significance
- Commonly missing data points included means, standard deviations/standard errors, and sample sizes
Recommendations:
- Report means, sample sizes, and standard deviations/errors for treatment and comparison groups on all key outcomes (these summary statistics are enough to recompute a significance test, as sketched below).
- Report results from appropriate significance testing of differences observed between groups.
- If using a regression model or ANOVA, describe the model and report means and standard deviations/errors.
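A brief illustration of why those summary statistics matter: means, standard deviations, and sample sizes alone are enough to recompute a two-sample t-test. The post-test values below are hypothetical.

```python
# Illustrative significance test computed directly from reported summary statistics.
from scipy.stats import ttest_ind_from_stats

t_stat, p_val = ttest_ind_from_stats(mean1=78.4, std1=9.2, nobs1=45,    # treatment post-test
                                     mean2=73.1, std2=10.1, nobs2=42)   # comparison post-test
print(f"t = {t_stat:.2f}, p = {p_val:.3f}")
```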

18 Group Discussion
- What challenges have you encountered in your efforts to evaluate the MSP project?
- How have you overcome, or how might you overcome, these obstacles?
- What has enabled you to increase the rigor of your evaluations?
- If you could start your evaluation anew, what would you do differently?
