Class 11-12: Chapters 5 & Elkin et al. (1989)

Presentation transcript:

Class Chapters 5 & Elkin et al. (1989)

Threats to Statistical Conclusion Validity
Are the observed relations among variables accurate?
  Power
  Unreliability of Measures
    Introduces error variance
    Attenuates correlations
  Unreliability of Treatment Implementation
    Specificity: active ingredients
    Fidelity of delivery
    Competency
  Extraneous Variance in the Experimental Setting
  Heterogeneity of Participants
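How unreliable measures attenuate correlations can be illustrated with the classical attenuation formula from test theory: the expected observed correlation equals the true correlation times the square root of the product of the two reliabilities. The formula is standard, but it is not taken from the slides, and the function name and numbers below are hypothetical.

```python
import math

def attenuated_correlation(true_r: float, reliability_x: float, reliability_y: float) -> float:
    """Expected observed correlation when both measures contain random error."""
    for rel in (reliability_x, reliability_y):
        if not 0.0 < rel <= 1.0:
            raise ValueError("reliability must be in (0, 1]")
    return true_r * math.sqrt(reliability_x * reliability_y)

# Hypothetical values: true correlation .50, reliabilities .80 and .70.
observed = attenuated_correlation(0.50, 0.80, 0.70)
print(round(observed, 3))  # 0.374
```

With perfectly reliable measures (both reliabilities equal to 1) the observed correlation equals the true one; any measurement error shrinks it.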

Threats to Internal Validity
Can we conclude that there is a causal relation between the IV and the DV? Did treatment cause differences in the DV across groups?
  Selection
    Inclusion-exclusion criteria; who gets assigned to which group?
  History
  Attrition
    What do we know about drop-outs?
  Repeated Testing Effects
  Reaction to Control Group Assignment
    Double-blind designs (pharmaceutical studies)
    Placebo effects: are non-specific factors or the active ingredient responsible for the observations?
    Houston study

Department of Veterans Affairs (VA) and Baylor College of Medicine, Houston
180 patients with osteoarthritis and knee pain were randomly assigned to one of three procedures (New England J of Medicine, 2002):
  Debridement: worn, torn cartilage is cut and removed with the aid of a viewing tube called an arthroscope
  Arthroscopic lavage: bad cartilage is flushed out
  Simulated arthroscopic surgery: small incisions were made, but no instruments were inserted and no cartilage was removed

Findings
  During two years of follow-up, patients in all three groups reported moderate improvements in pain and ability to function.
  The intervention groups did not report less pain or better function than the placebo group.
  Placebo patients reported better outcomes than the debridement patients at certain points during follow-up.
  Patients were blind to the type of surgery.

Threats to Construct Validity
To what extent do variables capture the desired constructs?
  Mono-Operation Bias (Instruments)
  Mono-Method Bias
    Self-report
    Clinician rated
  Experimenter Expectancies
    Allegiance effect

Threats to External Validity
Can we generalize observed relations across persons, settings and times?
  Person-Units
  Outcome Measures
  Settings

Elkin et al.: Purpose
  Test feasibility of the collaborative clinical trial model
  Examine relative efficacy of CBT, IPT, and medication for depression

NIMH Treatment of Depression Collaborative Research Program
  Sites: U. of Pittsburgh, George Washington U., U. of Oklahoma
  250 patients: major depressive disorder
  28 therapists: 2 to 27 years of experience; 71% male
    10 psychologists
    18 psychiatrists

Experimental Between-Group Designs
  1. Post-Test Only Control
  2. Pre-Test / Post-Test Control
  3. Solomon Four Group (combination of 1 and 2 above)
  Factorial Design: more than one independent variable; interactions (treatment X therapist or patient characteristic)
  Dependent Sample Design (Matching)

Experimental Between-Group Designs
  1. Post-Test Only Control
  2. Pre-Test / Post-Test Control
  3. Solomon Four Group (combination of 1 and 2 above)
  Factorial Design (Post Hoc): more than one independent variable; interactions (treatment X patient characteristic: depression level at intake)
  Dependent Sample Design (Matching)

IVs: Experimental Groups
  Cognitive Behavioral Therapy
  Interpersonal Therapy
    16 individual sessions, 50 min. each
  Medication + Clinical Management*
  Pill-Placebo + Clinical Management*
    1st session 55 min.; then 20 to 25 min.
  * Minimal supportive therapy condition

Dependent Variables
  Clinical Evaluator
  Self Report

Dependent Variables
  Clinical Evaluator
    Hamilton Rating Scale for Depression (HRSD)
    Global Assessment Scale (GAS)
  Self Report
    Beck Depression Inventory (BDI)
    Hopkins Symptom Checklist (HSCL-90)

Outcome Research Strategies
  Primary Analyses
  Secondary Analyses (Post-Hoc)

  1. Treatment Package Strategy
  2. Dismantling Strategy
  3. Constructive Strategy
  4. Parametric Strategy (structural components)
  5. Comparative Outcome Strategy
  6. Client and Therapist Variation Strategy (Moderation Designs)

Outcome Research Strategies
  Primary Analyses
    Treatment package
    Comparative
  Secondary Analyses
    Client variation: moderation effect?

Outcome Research Strategies
  Secondary Analyses
    Client variation: moderation effect
    Depression level at intake as moderator of differences in outcomes between treatment groups
    Were outcomes across treatment groups different for patients with higher versus lower levels of depression at pre-test?

Control Groups
  CBT
  IPT
  Medication + Clinical Management*
  Pill-Placebo + Clinical Management*
  * Minimal supportive therapy condition

Treatments & Therapists
  Cognitive Behavioral Therapy, Interpersonal Therapy
    Different group of experienced therapists
  Medication + Clinical Management, Pill-Placebo + Clinical Management
    Same therapists: psychiatrists

Treatments & Therapists
  Cognitive Behavioral Therapy, Interpersonal Therapy
    Different group of experienced therapists (potential confound)
  Medication + Clinical Management, Pill-Placebo + Clinical Management
    Same therapists: psychiatrists (safeguards internal validity, undermines generalizability)

Ensure Valid Treatments
  Specify the treatment(s)
  Therapist training/monitoring
  Fidelity checks

Ensure Valid Treatments
  Specify the treatment(s)
    Manuals
  Therapist training/monitoring
  Fidelity checks: therapy tapes
    Collaborative Study Psychotherapy Rating Scale (CSPRS): taped treatments could be discriminated 95% of the time

Attrition (>15 sessions or 12 weeks)
  Total: 77/239 (32%)
  CBT: 32%
  IPT: 23%
  Meds/CM: 33%
  Placebo/CM: 40%
  Early terminators were more depressed at pre-test than completers.

Which group to use in outcome analysis?
  Total N = 239
  Completers N = weeks or 12 sessions
  End-Point N = 204: at least 3.5 weeks or 4 sessions
  End-Point N = 239: intent-to-treat group (last assessment or pre-test)
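The intent-to-treat rule above (use the last assessment, or the pre-test if there is none) can be sketched as follows. This is a minimal illustration, not the study's actual procedure; the function name and the scores are hypothetical.

```python
from typing import Optional, Sequence

def endpoint_score(pre_test: float, assessments: Sequence[Optional[float]]) -> float:
    """Return the last available assessment, falling back to the pre-test score."""
    for score in reversed(assessments):
        if score is not None:
            return score
    return pre_test

# Hypothetical HRSD scores at successive assessments (None = assessment missed).
dropped_after_second = endpoint_score(24, [20, 17, None, None])
never_reassessed = endpoint_score(24, [None, None, None, None])
print(dropped_after_second, never_reassessed)  # 17 24
```

Carrying the pre-test forward keeps every randomized patient in the analysis, at the cost of treating early drop-outs as unchanged.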

Assessment Times
  Pre treatment
  Post Treatment
  4, 8, 12 weeks
  Termination: 15 weeks
  Follow up: 6, 12, 18 months

Analyses of Pre-test/Post-test
  (1) Paired t-test to examine differences between pre-test and post-test scores (p. 974)
  How many?
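The paired t-test named above compares each patient's pre-test score with the same patient's post-test score. A minimal sketch of the statistic, using only the standard library; the scores are hypothetical and are not data from the study.

```python
import math
from statistics import mean, stdev

def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for paired pre/post scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # sample SD of the differences
    return mean(diffs) / standard_error, n - 1

# Hypothetical depression scores for five patients before and after treatment.
pre_scores = [25, 22, 28, 19, 24]
post_scores = [12, 15, 20, 10, 18]
t_stat, df = paired_t(pre_scores, post_scores)
print(round(t_stat, 2), df)
```

The resulting t value is compared against the t distribution with n - 1 degrees of freedom to obtain a p value.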

Table 1 Completer Group: At least 12 sessions; n=155 (page 975)

Analyses of Pre-test/Post-test
  (1) Paired t-test to examine differences between pre-test and post-test scores (p. 974)
  How many?
    4 treatment groups: CBT, IPT, IMI-CM, Pla-CM
    X 4 outcome measures: HRSD, GAS, BDI, HSCL-90
    X 3 samples: Completers; End Point 204; End Point 239
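The "how many?" question can be answered by enumerating every combination of treatment group, outcome measure and sample listed on the slide. A small sketch (variable names are illustrative):

```python
from itertools import product

treatments = ["CBT", "IPT", "IMI-CM", "Pla-CM"]
measures = ["HRSD", "GAS", "BDI", "HSCL-90"]
samples = ["Completers", "End Point 204", "End Point 239"]

# One paired t-test per (treatment, measure, sample) combination.
t_tests = list(product(treatments, measures, samples))
print(len(t_tests))  # 48
```

That is 4 x 4 x 3 = 48 paired t-tests, which is worth keeping in mind when weighing the number of significant results.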

Findings: T-Tests (p. 974, right)


Analyses of Post-test Scores
  Use the pre-test as a covariate in analyses of covariance to compare mean post-test scores across the 4 treatment groups
  Calculate a residualized change score: the amount of variability in the post-test that is not associated with the pre-test score
  Used p < .10 in the ANCOVAs and p = .10/6 = .0167 (approx. .017) for the 6 pair-wise comparisons (Bonferroni correction, p. 974)
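The Bonferroni arithmetic above follows from the number of pairs that can be formed among 4 treatment groups. A short sketch that reproduces the .10/6 figure (group labels as used elsewhere in the slides):

```python
from itertools import combinations

groups = ["CBT", "IPT", "IMI-CM", "PLA-CM"]
pairs = list(combinations(groups, 2))  # every unordered pair of groups

family_alpha = 0.10                    # alpha used for the overall ANCOVAs
per_comparison_alpha = family_alpha / len(pairs)
print(len(pairs), round(per_comparison_alpha, 3))  # 6 0.017
```

Dividing the family-wise alpha by the number of comparisons keeps the chance of at least one false positive across the 6 tests at or below .10.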

Table 1 Completer Group: At least 12 sessions; n=155 (page 975)

ANCOVAs: Post-test Scores
  Statistically significant differences between groups in scales at post-test
  Four 3 X 4 ANCOVAs, 3 (sites) X 4 (treatment groups): differences across treatments in post-treatment scores on HRSD, GAS, BDI, HSCL-90
  Analyses are reported only for treatment groups, combined across sites
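The idea of controlling post-test scores for pre-test scores can be illustrated with a residualized change score: regress post-test on pre-test and keep the part of each post-test score that the pre-test does not predict. This is a simplified sketch of that single idea, not the full 3 X 4 ANCOVA; the function name and scores are hypothetical.

```python
from statistics import mean

def residualized_change(pre, post):
    """Residuals from a simple least-squares regression of post-test on pre-test."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    mean_pre, mean_post = mean(pre), mean(post)
    ss_pre = sum((x - mean_pre) ** 2 for x in pre)
    if ss_pre == 0:
        raise ValueError("pre-test scores have no variance")
    slope = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post)) / ss_pre
    intercept = mean_post - slope * mean_pre
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]

# Hypothetical pre-test and post-test scores for five patients.
pre_scores = [25, 22, 28, 19, 24]
post_scores = [12, 15, 20, 10, 18]
residuals = residualized_change(pre_scores, post_scores)
print([round(r, 2) for r in residuals])
```

A negative residual means the patient ended lower (less depressed) than the pre-test score alone would predict; group means of these residuals can then be compared.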

Covariates
  Pre-test scores
  Marital status (1, 2)
Why not MANCOVAs? P

Table 1 Completer Group: At least 12 sessions; n=155 (page 975)
  p<
  BDI: no significant differences in pair-wise comparisons

Table 1: End Point 239 group (columns: CBT, IPT, IMI-CM, PLA-CM; p < .10)

Findings: Pair-wise Comparisons
  Completer (N = 155)
    Self-report: BDI pairwise NS; HSCL-90-T p = .006, IMI-CM < PLA-CM
  EP-204
    Clinical evaluator: GAS, IMI-CM < PLA-CM (trend p= )
  EP-239
    Clinical evaluator: HRSDep, IPT, IMI-CM < PLA-CM; GAS p = .010, IMI-CM < PLA-CM (trend p = .017, .018)

Measuring Change: Elkin et al.
  Statistical significance
  Clinical significance
    Recovery analysis

Measuring Change: Elkin et al.
  Statistical significance
    Differences between groups in scales at post-test, controlling for pre-test scores
  Clinical significance
    Percentage of participants that changed from a dysfunctional to a functional level (using cut-off scores)

Clinical Significance: Recovery Analyses
  Cut-off scores
    Not depressed: HRSD and BDI
    Depressed: HRSD and BDI
  Statistical analyses

Clinical Significance: Recovery Analysis
Proportion of patients who improved vs. not improved
  Cut-off scores
    Not depressed: HRSD < 6 and BDI < 9
    Depressed: HRSD > 6 or BDI > 9
  Statistical analyses
    Chi square: proportion of depressed and non-depressed patients across treatment groups at termination
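The cut-off rule above can be sketched as a simple classifier. The thresholds are taken from the slide as written; how scores of exactly 6 (HRSD) or 9 (BDI) are handled is not stated there, so treating them as not recovered is an assumption of this sketch, and the patient scores are hypothetical.

```python
def is_recovered(hrsd: float, bdi: float) -> bool:
    """Slide's rule: not depressed if HRSD < 6 and BDI < 9.

    Assumption: scores exactly at a cut-off count as not recovered.
    """
    return hrsd < 6 and bdi < 9

def recovery_rate(patients) -> float:
    """Proportion of (hrsd, bdi) score pairs that meet the recovery rule."""
    if not patients:
        raise ValueError("no patients given")
    return sum(is_recovered(h, b) for h, b in patients) / len(patients)

# Hypothetical termination scores as (HRSD, BDI) pairs.
example_group = [(5, 8), (10, 20), (3, 4), (7, 2)]
print(recovery_rate(example_group))  # 0.5
```

Computing this rate per treatment group gives the proportions that the chi-square test then compares.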


End Point 239: recovery on HRSD (p = .04)
Proportion of cases that met recovery criteria:
  CBT: 36% (ns)
  IPT: 43%
  IMI-CM: 42%
  P-CM: 21%
Chi square (Χ²) tests to what extent the proportion in each group is what may be expected by chance, or whether it is larger or smaller than expected.
IPT = IMI-CM > Placebo-CM
CBT: % comparison was not sig. for any group
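The chi-square logic described above (observed proportions versus those expected by chance) can be sketched as a test of independence on a groups-by-outcome table. The slide gives percentages but not group sizes, so the counts below are hypothetical and chosen only for illustration; they are not the study's data.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total  # count expected by chance
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts: rows = CBT, IPT, IMI-CM, P-CM; columns = recovered, not recovered.
counts = [[22, 38], [26, 34], [25, 35], [13, 47]]
chi2, df = chi_square_independence(counts)
print(round(chi2, 2), df)
```

The statistic is compared against the chi-square distribution with (rows - 1) x (columns - 1) degrees of freedom; a large value means the recovery proportions differ across groups more than chance would suggest.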

Completer Group on HRSD
Proportion of cases that met recovery criteria:
  CBT: 51%
  IPT: 55%
  IMI-CM: 57%
  P-CM: 29%
Chi square (Χ²) tests to what extent the proportion in each group is what may be expected by chance, or whether it is larger or smaller than expected.
IPT, IMI-CM > Placebo-CM

Secondary Analyses
To examine the effect of pre-treatment severity (HRSD/GAS) on outcome by treatment group
  DVs: post-treatment scores
  Severity criteria
    HRSD > 20: 44% of sample
    GAS < 50: 41%
  Covariate: marital status

2X4 ANCOVA (severity x treatment)
DVs: post-test HRSD, GAS, BDI, HSCL-90
  Main Effect for
  (Interaction term)***

2X4 ANCOVA (severity x treatment)
DVs: post-test HRSD, GAS, BDI, HSCL-90
  Main Effect for Severity
    More severe: pre-test HRSD > 20; GAS < 50
    Less severe pre-test
  Main Effect for Treatment
    CBT, IPT, IMI-CM, P-CM
  Severity X Treatment (interaction term)*******

Interaction Effect: HRSD Severity x TG (p. 976)
Dependent variables: HRSD*, GAS, BDI, HSCL-90
4 sets of three 2X4 ANCOVAs: 4 DVs, 3 sample subgroups
[Tables: treatment group (CBT, IPT, IMI-CM, P-CM) by severity (High HRSD, Low HRSD) for the Completer*, End Point 204* and End Point 239^ samples; cell values not recoverable]
*p<.10; ^p<.11

Interaction Effect: GAS Severity x TG
Dependent variables: HRSD, GAS, BDI, HSCL-90
[Tables: treatment group (CBT, IPT, IMI-CM, P-CM) by severity (High GAS, Low GAS) for the Completer**, End Point 204**** and End Point 239* samples; cell values not recoverable]

Treatment by Severity Interaction, end-point 204 sample
[Figure: two panels; in one a higher score is a negative outcome, in the other a higher score is a positive outcome]

Summary: all pairwise analyses following interaction effects (p. 976)
  Less severe groups: no differences across treatment groups
  More severe groups:
    IPT more effective than PLA-CM in 3 instances, all on the HRSD measure in the End Point 204 sample (3 out of 4 comparisons)
    IMI-CM more effective than PLA-CM across a number of measures (8 out of 10 comparisons)

Figure 2 Recovery Rates (%) endpoint /204 sample

Figure 2 Recovery Rates (%) endpoint /204 sample for severity groups (p. 977)
  Less severe subgroups: NS differences among treatments for all samples with HRSD or GAS.
  More severe subgroups for HRSD and GAS: consistent findings across the three samples; IPT > PLA-CM in 5/6 and IMI-CM > PLA-CM in 6/6.

Threats to Statistical Conclusion Validity
Are the observed relations among variables accurate?
  Power
  Unreliability of Measures
  Unreliability of Treatment Implementation
  Extraneous Variance in the Experimental Setting
  Heterogeneity of Participants

Threats to Statistical Conclusion Validity
Are the observed relations among variables accurate?
  Power
    Large N by group range
    Outcome measures are well-known +
    Power analyses: 81-95% for medium effects +
    p < .10 for ANCOVAs and .10/6 for pairwise comparisons
  Unreliability of Measures
  Unreliability of Treatment Implementation
    Experienced therapists: 2 to 27 years, mean = 11 +
    Manuals, training per treatment group +
    Closely monitored +
    Taped sessions: 95% correctly classified +
  Extraneous Variance in the Experimental Setting
    Not known for the most part
    28 therapists with 3 to 11 patients each; no way to control for therapist effects
    P one site CBT another site IPT similar to Meds/CM
  Heterogeneity of Participants
    Random assignment to groups +
    Only included 45% of those screened +
    Mostly women: 70% female +
    89% white participants +
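How sample size and alpha feed into power can be illustrated with a normal-approximation power calculation for comparing two group means. This is a rough sketch under stated assumptions (two independent groups of equal size, a two-sided test, a standardized effect size d); it is not the power analysis reported for the study, and the group size used below is hypothetical.

```python
from statistics import NormalDist

def approx_power_two_groups(effect_size_d: float, n_per_group: int, alpha: float = 0.10) -> float:
    """Approximate power of a two-sided, two-sample comparison of means."""
    standard_normal = NormalDist()
    z_critical = standard_normal.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size_d * (n_per_group / 2) ** 0.5
    # Probability of landing in either rejection region under the alternative.
    return (standard_normal.cdf(noncentrality - z_critical)
            + standard_normal.cdf(-noncentrality - z_critical))

# Hypothetical: medium effect (d = 0.5), 60 patients per group, alpha = .10.
power_medium = approx_power_two_groups(0.5, 60, alpha=0.10)
print(round(power_medium, 2))
```

Raising alpha from the conventional .05 to .10, as the slides note was done for the ANCOVAs, lowers the critical value and therefore increases power for the same sample size.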

Threats to Internal Validity
Can we conclude that there is a causal relation between the IV and the DV? Did treatment cause differences in the DV across groups?
  Selection
    Who gets assigned to which group?
  History
  Attrition
    What do we know about drop-outs?
  Repeated Testing Effects
  Reaction to Control Group Assignment

Threats to Internal Validity
Can we conclude that there is a causal relation between the IV and the DV?
  Selection
    Used randomization; see factors under Heterogeneity of Participants
  History
    Time frame of study not reported
    Did therapy happen at about the same time for everyone?
  Attrition
    Relatively high attrition rate (32%); about 25% was for negative reasons related to treatment (-)
    Early terminators were more depressed at intake (-)
  Repeated Testing Effects
    Tested at frequent intervals: pre-test; 4, 8 and 12 weeks; termination; 6, 12 and 18 months follow-up
  Reaction to Control Group Assignment
    Not known, but could be the case. Placebo/CM experienced the highest attrition: CBT 32%, IPT 23%, Meds/CM 33%, Placebo/CM 40%

Threats to Construct Validity
To what extent do variables capture the desired constructs?
  Mono-Operation Bias (Instruments)
  Mono-Method Bias
  Experimenter Expectancies

Threats to Construct Validity
To what extent do variables capture the desired constructs?
  Mono-Operation Bias (Instruments)
    Used 4 different outcome measures: HRSD, BDI, GAS, HSCL-90 +
    Measures of well-known psychometric properties +
  Mono-Method Bias
    Used both patient self-report and clinician-completed measures +
    Measures of well-known psychometric properties +
  Experimenter Expectancies
    Clinicians not blind to therapy modality -
    Psychiatrists blind to medication condition +

Threats to External Validity
Can we generalize observed relations across persons, settings and times?
  Person-Units
  Outcome Measures
  Settings

Threats to External Validity
Can we generalize observed relations across persons, settings and times?
  Person-Units
    Highly selected sample (-)
    Only 45% of those screened were selected (-)
    Generalizable to white (89%), highly educated (75% college degree or some college) women (70%) who were less severely depressed (p. 974)
  Outcome Measures
    Interview and self-report measures +
    Clinical significance: recovery rates +
    Statistically significant findings were not consistent across measures; HRSD detected more differences in depression than BDI -
  Settings
    Empirical question?

Results: Summary 1/3
Paired t-tests showed statistically significant differences (p < .001) between pre- and post-test scores on all measures for all three groups of participants (even pill-placebo/CM):
  Intent-to-treat
  Completers, minimum 3.5<Sessions
  Completers of all or most sessions, at least (n=155)

Results: Summary 2/3
  ANCOVAs showed no statistically significant differences in pre-test scores on any measure for any treatment group
  Statistically significant differences in post-test scores:
    BDI/HSCL-90: Completers
    HRSD/GAS: Total Group (239)

Results: Summary 3/3
  Pairwise follow-up ANCOVAs
    HSCL-90: IMI-CM > PLA-CM (Completer)
    GAS: IMI-CM > PLA-CM (Total 239 group)
    HRSD: IPT, IMI-CM > PLA-CM, trend (Total 239)
  Recovery findings (clinical significance), post-test HRSD < 6
    IPT (43%), IMI-CM (42%) > PLA-CM (21%) (End-Point 239)
    CBT = 36%, NS