
Session 2: Specifying the Conceptual and Operational Models and the Research Questions that Follow
Mark W. Lipsey, Vanderbilt University
IES/NCER Summer Research Training Institute, 2007

Workshop on randomized controlled trials

Purpose: Increasing capacity to develop and conduct rigorous evaluations of the effectiveness of education interventions.

Caveat: “Rigorous evaluations” are not appropriate for every intervention or every research project involving an intervention:
– They require special resources (funding, amenable circumstances, expertise, time)
– They can produce misleading or uninformative results if not done well
– The preconditions for making them meaningful may not be met

Critical preconditions for rigorous evaluation

A well-specified, fully developed intervention with useful scope:
– basis in theory and prior research
– identified target population
– specification of intended outcomes/effects
– “theory of change” explication of what it does and why it should have the intended effects for the intended population
– operators’ manual: complete instructions for implementing
– ready-to-go materials, training procedures, software, etc.

Critical preconditions for rigorous evaluation (continued)

– A plausible rationale that the intervention is needed; reason to believe it has advantages over what’s currently proven and available
– Clarity about the relevant counterfactual: what it is supposed to be better than
– Demonstrated “implementability”: can be implemented well enough in practice to plausibly have effects
– Some evidence that it can produce the intended effects, albeit short of standards for rigorous evaluation

Critical preconditions for rigorous evaluation (continued)

Amenable research sites and circumstances:
– cooperative schools, teachers, parents, and administrators willing to participate
– student sample appropriate in terms of representativeness and size for showing educationally meaningful effects
– access to students (e.g., for testing), records, classrooms (e.g., for observations)

IES funding categories

– Goal 2 (intervention development): for advancing intervention concepts to the point where rigorous evaluation of their effects may be justified
– Goal 3 (efficacy studies): for determining whether an intervention can produce worthwhile effects; RCT evaluations preferred
– Goal 4 (effectiveness studies): for investigating the effects of an intervention implemented under realistic conditions at scale; RCT evaluations preferred

Specifying the theory of change embodied in the intervention

1. Nature of the need addressed
– what and for whom (e.g., 2nd grade students who don’t read well)
– why (e.g., poor decoding skills, limited vocabulary)
– where the issues addressed fit in the developmental progression (e.g., prerequisites to fluency and comprehension, assumes concepts of print)
– rationale/evidence supporting these specific intervention targets at this particular time

Specifying the theory of change

2. How the intervention addresses the need and why it should work
– content: what the student should know or be able to do; why this meets the need
– pedagogy: instructional techniques and methods to be used; why appropriate
– delivery system: how the intervention will arrange to deliver the instruction

Most important:
– What aspects of the above are different from the counterfactual condition?
– What are the key factors or core ingredients most essential and distinctive to the intervention?

Logic models as theory schematics

[Example logic model diagram:]
– Target population: 4-year-old pre-K children
– Intervention: exposed to intervention
– Proximal outcomes: positive attitudes to school; improved pre-literacy skills; learn appropriate school behavior
– Distal outcomes: increased school readiness; greater cognitive gains in K

Mapping variables onto the intervention theory: Sample characteristics

[Same logic model diagram, annotated with sample-related variables:]
– Sample descriptors: basic demographics; diagnostic, need/eligibility identification; nuisance factors (for variance control)
– Potential moderators: setting, context; personal and family characteristics; prior experience

Mapping variables onto the intervention theory: Intervention characteristics

[Same logic model diagram, annotated with intervention-related variables:]
– Independent variable: T vs. C experimental condition
– Generic fidelity: T and C exposure to the generic aspects of the intervention (type, amount, quality)
– Specific fidelity: T and C(?) exposure to distinctive aspects of the intervention (type, amount, quality)
– Potential moderators: characteristics of personnel; intervention setting, context (e.g., class size)

Mapping variables onto the intervention theory: Intervention outcomes

[Same logic model diagram, annotated with outcome variables:]
– Focal dependent variables: pretests (pre-intervention); posttests (at end of intervention); follow-ups (lagged after end of intervention)
– Other dependent variables:
  – construct controls: related DVs not expected to be affected
  – side effects: unplanned positive or negative outcomes
  – mediators: DVs on causal pathways from intervention to other DVs
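A minimal sketch, purely illustrative and not part of the presentation, of how the variable roles in this mapping might be recorded when planning an analysis dataset; every variable name below is a hypothetical placeholder.

```python
# Illustrative only: hypothetical variable names organized by their role
# in the intervention theory (sample, moderators, IV, fidelity, DVs).
variable_map = {
    "sample_descriptors": ["age_months", "gender", "eligibility_status"],
    "potential_moderators": ["site", "home_language", "prior_preschool"],
    "independent_variable": "treat",                   # T vs. C condition
    "fidelity": ["sessions_attended", "curriculum_coverage"],
    "focal_dvs": {
        "pretest": ["preliteracy_pre"],
        "posttest": ["preliteracy_post", "school_attitudes_post"],
        "follow_up": ["school_readiness_k"],
    },
    "other_dvs": {
        "construct_controls": ["gross_motor_skills"],  # not expected to change
        "mediators": ["school_attitudes_post"],        # on the causal pathway
    },
}
```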

Main relationships of (possible) interest
– Causal relationship between IV and DVs (effects of causes); tested as T-C differences
– Duration of effects post-intervention; growth trajectories
– Moderator relationships; ATIs (aptitude-treatment interactions): differential T effects for different subgroups; tested as T x M interactions or T-C differences between subgroups
– Mediator relationships: stepwise causal relationship with effect on one DV causing effect on another; tested via Baron & Kenny (1986) or SEM-type techniques
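A minimal sketch, not from the presentation, of how these relationships might be tested with pandas/statsmodels; it assumes a hypothetical data file with columns treat (1 = T, 0 = C), moderator, mediator, pretest, and posttest.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("outcomes.csv")  # hypothetical data file

# 1. Causal relationship: T-C difference on the focal outcome (pretest as covariate)
effect = smf.ols("posttest ~ treat + pretest", data=df).fit()

# 2. Moderator relationship: T x M interaction term
moderation = smf.ols("posttest ~ treat * moderator + pretest", data=df).fit()

# 3. Mediator relationship (Baron & Kenny-style steps)
step_a = smf.ols("mediator ~ treat + pretest", data=df).fit()             # T -> mediator
step_b = smf.ols("posttest ~ treat + mediator + pretest", data=df).fit()  # mediator -> DV, controlling T

print(effect.params["treat"], moderation.params["treat:moderator"])
print(step_a.params["treat"], step_b.params["mediator"])
```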

Formulation of the research questions
– Organized around key variables and relationships
– Specific with regard to the nature of the variables and relationships
– Supported with a rationale for why the question is important to answer
– Connected to real-world education issues

What works, for whom, under what circumstances, how, and why?

Session 3: Describing and Quantifying Outcomes
Mark W. Lipsey, Vanderbilt University
IES/NCER Summer Research Training Institute, 2007

Outcome constructs to measure

Identifying the relevant outcome constructs follows from the theory development and other considerations covered earlier in Session 2:
– What: proximal/mediating and distal outcomes
– When: temporal status (baseline, immediate outcome, longer-term outcomes)
– What else: possible positive or negative side effects; construct-control outcomes not targeted for change

Aligning the outcome constructs and measures with the intervention and policy objectives

[Diagram: alignment among instruction, assessment, and policy-relevant outcomes (e.g., state achievement standards)]

Alignment of instructional tasks with the assessment tasks

[Diagram: assessment tasks may be identical to the instructional tasks, activities, and content, analogous to them (near transfer), or generalized (far transfer)]

Basic psychometric issues

Validity (typically correlation with established measures or subgroup differences)
Reliability (typically internal consistency or test-retest correlation)
– standardized measures of established validity and reliability
– researcher-developed measures with validity and reliability demonstrated in prior research
– new measures with validity and/or reliability to be investigated in the present study
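A minimal sketch, not from the presentation, of the two reliability checks named above: Cronbach's alpha for internal consistency and a test-retest correlation. The item score matrix and the second administration are simulated, hypothetical data.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal consistency: items is an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(0)
scores = rng.normal(size=(100, 10)) + rng.normal(size=(100, 1))  # toy correlated items
print("alpha:", round(cronbach_alpha(scores), 2))

# Test-retest reliability: correlation between total scores at time 1 and time 2
time1 = scores.sum(axis=1)
time2 = time1 + rng.normal(scale=0.5, size=100)                  # simulated retest
print("test-retest r:", round(np.corrcoef(time1, time2)[0, 1], 2))
```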

Special issue for intervention studies: sensitivity to change

Achievement effect sizes from 97 randomized education studies

Type of outcome measure          | Mean effect size | Number of measures
Standardized test, broad         | .09              | 29
Standardized test, narrow        | –                | –
Focal topic test, mastery test   | .50              | 263
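As a reference point for the table, here is a minimal sketch (assumed, not from the presentation) of the standardized mean difference effect size: Cohen's d from pooled-SD standardization, with Hedges' small-sample correction, computed from hypothetical treatment and control posttest scores.

```python
import numpy as np

def hedges_g(treatment: np.ndarray, control: np.ndarray) -> float:
    """Standardized mean difference (Cohen's d) with Hedges' small-sample correction."""
    nt, nc = len(treatment), len(control)
    pooled_sd = np.sqrt(((nt - 1) * treatment.var(ddof=1) +
                         (nc - 1) * control.var(ddof=1)) / (nt + nc - 2))
    d = (treatment.mean() - control.mean()) / pooled_sd
    correction = 1 - 3 / (4 * (nt + nc) - 9)  # Hedges' correction factor
    return d * correction

rng = np.random.default_rng(1)
t_scores = rng.normal(loc=102, scale=15, size=60)  # hypothetical treatment posttests
c_scores = rng.normal(loc=100, scale=15, size=60)  # hypothetical control posttests
print(round(hedges_g(t_scores, c_scores), 2))
```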

Data from which measurement sensitivity can be inferred
– Observed effects from other intervention studies using the measure
– Mean effect sizes and their standard deviations from meta-analysis
– Longitudinal research and descriptive research showing change over time or differences between relevant criterion groups
– Archival data allowing ad hoc analysis of, e.g., change over time or differences between groups
– Pilot data on change over time or group differences with the measure

Variance control and measurement sensitivity

Variance control via procedural consistency and statistical control, using covariates for, e.g., pre-intervention individual differences and differences in testing procedures or conditions
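A minimal sketch, assumed rather than taken from the presentation, of statistical variance control: an ANCOVA-style model in which a pretest covariate and a testing-condition factor absorb irrelevant outcome variance and sharpen the treatment contrast. The column names (posttest, treat, pretest, tester) and data file are hypothetical.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("study_data.csv")  # hypothetical data file

# Unadjusted T-C comparison
unadjusted = smf.ols("posttest ~ treat", data=df).fit()

# Covariate-adjusted comparison: pretest plus a testing-condition factor
adjusted = smf.ols("posttest ~ treat + pretest + C(tester)", data=df).fit()

# Same treatment estimand, but the standard error is typically smaller
# once covariates soak up pre-existing individual differences.
print(unadjusted.bse["treat"], adjusted.bse["treat"])
```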

Issues related to multiple outcome measures

Correlated measures: overlap and efficiency

[Table: Factor Analysis of Preschool Outcome Variables. Factor loadings for the subtests Letter Word Identification, Quantitative Concepts, Applied Problems, Picture Vocabulary, Oral Comprehension, and Story Recall at Pre-K pretest, Pre-K posttest, and Kindergarten follow-up; loading values not reproduced here.]

Correlated change may be even more relevant

[Table: Factor Analysis of Gain Scores for Pre-K Outcomes. Two factors: Basic School Skills (Letter Word Identification, Quantitative Concepts, Applied Problems) and Complex Language (Picture Vocabulary, Oral Comprehension, Story Recall), with loadings for pre-to-post, post-to-follow-up, and pre-to-follow-up gains; loading values not reproduced here.]
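A minimal sketch, assumed and not from the presentation, of a factor analysis of gain scores like the one summarized above, using scikit-learn; the subtest column names and the data file are hypothetical.

```python
import pandas as pd
from sklearn.decomposition import FactorAnalysis

subtests = ["letter_word_id", "quant_concepts", "applied_problems",
            "picture_vocab", "oral_comprehension", "story_recall"]

scores = pd.read_csv("prek_scores.csv")  # hypothetical wide file with *_pre and *_post columns
gains = (scores[[f"{s}_post" for s in subtests]].to_numpy()
         - scores[[f"{s}_pre" for s in subtests]].to_numpy())  # pre-to-post gain scores

fa = FactorAnalysis(n_components=2, random_state=0).fit(gains)
loadings = pd.DataFrame(fa.components_.T, index=subtests,
                        columns=["factor_1", "factor_2"])
print(loadings.round(2))  # inspect which subtests load on which factor
```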

Handling multiple correlated outcome measures
– Pruning: try to avoid measures that have high conceptual overlap and are likely to have relatively large intercorrelations
– Procedural: organize assessment and data collection to combine where possible for efficiency
– Analytic:
  – create composite variables to use in the analysis
  – use multivariate techniques like MANOVA to examine omnibus effects as context for univariate effects
  – use latent variable analysis, e.g., in SEM
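A minimal sketch, assumed rather than drawn from the presentation, of two of the analytic options above: a simple z-score composite of correlated outcomes, and a MANOVA omnibus test across the separate outcomes. The outcome and treatment column names are hypothetical.

```python
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

df = pd.read_csv("outcomes.csv")  # hypothetical data file
outcomes = ["vocab", "comprehension", "recall"]

# Composite: average of standardized outcome scores
z = (df[outcomes] - df[outcomes].mean()) / df[outcomes].std(ddof=0)
df["language_composite"] = z.mean(axis=1)

# MANOVA: omnibus treatment effect across the correlated outcomes
manova = MANOVA.from_formula("vocab + comprehension + recall ~ treat", data=df)
print(manova.mv_test())  # Wilks' lambda, Pillai's trace, etc.
```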

Practicality and appropriateness to the circumstances
– Feasibility: time and resources required
– Respondent burden: minimize demands, provide incentives/compensation
– Developmental appropriateness: consider not only age but performance level, possible ceiling and floor effects
– For follow-up beyond one school year, may need measures designed for a broad age span to maintain comparability
– May need to tailor measures or assessment procedures for special populations (disabilities, English language learners)