The Seven Deadly Sins of Program Evaluation William Ashton, Ph.D.
This Talk is for … Everyone -- Especially ATOD Professionals Some Experience with Program Evaluation Cookbook Look Out For Solutions Hot Topic -- Outcomes One Step Beyond
A brief quote ... “Clearly, evaluation can evoke strong emotions, negative associations, and genuine fear.” -- Michael Q. Patton
Alternate Title A Psych Geek Talks about Boring & Technical Research Methodology & Program Evaluation Stuff
Seven Deadly Sins of Program Evaluation Please See Insert 1
A History of Evaluation Process Data documenting services delivered (e.g. clients seen, talks given, participants at talks) Outcome Data documenting changes for populations receiving services (e.g. increase in family cohesion, increase in knowledge of drug refusal skills) Effects documenting changes for populations receiving services that are due to the program -- and only the program
Counterfactuals Did My Program Make a Difference? Compared to What? Counterfactual -- should have beens Jim Fixx -- outcome: died while jogging at 52 Counterfactual: ‘should have’ died when he was 40 What difference did jogging make? Jogging had a life-lengthening effect
Another Counterfactual Example Program -- 1999-2000 school year you implement an anti-smoking program for eighth-graders Outcome -- Number of eighth-grade tobacco violations drops from 1998-1999 to 1999-2000. Did your smoking program work … or ... Counterfactual -- principal shifts school’s enforcement focus away from tobacco to weapons & threats in 1999-2000 -- violations would have dropped anyway!
The 7 Deadly Sins are ... 1. Using Bad Measures 2. Underestimating Regression 3. Underestimating Maturation 4. Underestimating Testing Effects 5. Underestimating Local History 6. Selected Groups 7. Using Bad Statistics
Deadly Sin #1 Bad Measures Tests: surveys, questionnaires Archival Data: data you get from someone else (published survey data, school records)
BAD Tests Look Out For: Homemade tests Solutions: Use published (standardized) tests Look For -- Internal Consistency (Reliability), a test’s ability to measure the trait and not error. (Cronbach) α > .72 Look For -- test-retest reliability, a test’s ability to measure the same trait twice. r > .70
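A minimal sketch of the two reliability checks named on this slide, for anyone who wants to try them on a homemade test. The function names and the tiny data set are made up for illustration; the cutoffs (.72 for Cronbach's alpha, .70 for test-retest r) are the ones on the slide. Published tests report these numbers in their manuals, so this is only needed when no manual exists.

```python
from statistics import mean, variance

def cronbach_alpha(scores):
    """Internal consistency. scores = one list of item scores per respondent."""
    k = len(scores[0])                                  # number of items
    items = list(zip(*scores))                          # one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])  # variance of total scores
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

def pearson_r(x, y):
    """Test-retest reliability: correlation between first and second testing."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: 5 respondents answering a 4-item homemade test.
answers = [[4, 5, 4, 5], [2, 2, 3, 2], [3, 3, 3, 4], [5, 5, 4, 5], [1, 2, 1, 2]]
first = [sum(row) for row in answers]     # total scores at the first testing
second = [17, 10, 12, 19, 7]              # made-up totals at the retest

print("alpha > .72?", cronbach_alpha(answers) > 0.72)
print("r > .70?", pearson_r(first, second) > 0.70)
```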
BAD Archival Data Archival Data -- data you get from someone else Examples: number of eighth-grade ATOD violations number of high school tobacco violations juvenile court referrals police report on gang activity office referrals at Ensley Avenue High School
BAD Archival Data Problem: Are the same procedures being used -- year to year -- to record data? Examples: principal shifts school’s enforcement focus away from tobacco to weapons & threats new school secretary records many tobacco violations as “other drug” violations hard drive crashes -- all ATOD&V data from Ensley Avenue High School is lost for 1999.
BAD Archival Data Look Out For: All archive data Solutions: Sherlock Holmes approach Look For changes in policy, changes in personnel; read the fine print
Pair off and ... Describe your program Describe how you could use bad measures in your program. Have your partner do the same Five minutes total
Seven Deadly Sins of Program Evaluation Please See Insert 2
Deadly Sin #2 Underestimating Regression When measuring the same thing twice extreme scores will become less extreme for no real reason Look Out For Giving a person the same test twice Forming groups based upon a pre-test score.
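Regression is easy to see in a simulation. The sketch below assumes a simple model that is not from the slides: each person has a stable true score, and every test adds fresh random error. Nobody receives any program. The people with the most extreme pre-test scores still score closer to average on the post-test.

```python
import random
from statistics import mean

def simulate_regression(n=10000, seed=0):
    """Return (pre-test mean, post-test mean) for the top 10% of pre-test scorers."""
    rng = random.Random(seed)
    true = [rng.gauss(50, 10) for _ in range(n)]      # stable trait
    pre = [t + rng.gauss(0, 10) for t in true]        # trait + measurement error
    post = [t + rng.gauss(0, 10) for t in true]       # same trait, new error
    cutoff = sorted(pre)[int(0.9 * n)]                # 90th percentile of pre-test
    extreme = [i for i in range(n) if pre[i] >= cutoff]
    return mean(pre[i] for i in extreme), mean(post[i] for i in extreme)

pre_mean, post_mean = simulate_regression()
print(f"extreme group pre-test mean:  {pre_mean:.1f}")
print(f"extreme group post-test mean: {post_mean:.1f}  (no treatment given)")
```

If a program had been given to this extreme group, the drop would look like a program effect even though it is only measurement error washing out.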
Don’t Form Groups Based upon Pre-Test Scores
Don’t Form Extreme Groups Please See Insert 3
Deadly Sin #2 Solution Don’t Form Extreme Groups Form groups based upon random assignment flip a coin!
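One way to do the coin flip for a whole roster at once is sketched below. It shuffles the list and splits it in half, which is a variant of coin flipping that also keeps the two groups the same size. The function name and the participant names are made up for illustration.

```python
import random

def assign_groups(participants, seed=None):
    """Randomly split participants into treatment and control groups."""
    rng = random.Random(seed)        # pass a seed only to make the split repeatable
    shuffled = list(participants)    # copy so the original roster is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

roster = ["Ana", "Ben", "Cal", "Dee", "Eli", "Fay", "Gus", "Hal"]
groups = assign_groups(roster, seed=42)
print(groups["treatment"])
print(groups["control"])
```

The key point is that pre-test scores, teacher opinions and participant preferences play no part in who lands where.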
Deadly Sin #3 Underestimating Maturation Participants “Grow Up” between pre-test and post-test Example: Behavioral and Emotional Rating Scale’s Interpersonal Strength subscale shows an increase between the pre-test (beginning of ninth grade) and post-test (end of ninth grade) Effect of program which targeted individual risk factors … or ... normal growth in Interpersonal Strength during the first year of high school?
Maturation Look Out For: Long test-retest intervals (some tests list acceptable intervals); test-retest intervals during “growth spurts” Solution: Avoid above warning signs; use a control group
Deadly Sin #4 Underestimating Testing Effects Pre-test influences behavior and/or responses on the post-test. Example: IQ Tests Pre-test → Refusal Skill Training → Post-test Is the positive outcome on the post-test caused by the training or the pre-test?
Testing Look Out For: Obvious (Transparent) tests; Highly Inflammatory (Reactive) tests Solutions: Avoid warning signs; use a control group
Deadly Sin #5 Underestimating “Local History” other non-treatment event influences treatment group Example 80% of FAST families evicted during FAST program School-wide anti-drug curriculum Drug-related death at school
Local History Look Out For: Single-group pre/post-test designs Solution: Sherlock Holmes Approach; use a control group
In groups of four … Find new people Form a group of four Describe your program Each person describe how either a regression, maturation, testing or local history sin could affect your program Help out your partners! Ten to Fifteen Minutes
Seven Deadly Sins of Program Evaluation Please See Insert 4
Control Groups Participants are randomly assigned to either the control or treatment group Control group is given tests, but not the treatment This creates a counterfactual
Control Group Design Control Group: Random Assignment → Pre-test → 8 Weeks Nothing → Post-test Treatment Group: Random Assignment → Pre-test → 8-Week FAST Curriculum → Post-test
Control Group Design Random Control/Treatment Design Eliminates Regression Maturation Testing Local History
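Why the control group works can be shown with a small simulation. All numbers below are assumptions chosen for illustration: everyone gains 3 points from maturation alone, and the program adds 2 more. The treatment group's raw pre/post gain overstates the program, while subtracting the control group's gain recovers something close to the true effect.

```python
import random
from statistics import mean

def simulate_trial(n=2000, maturation=3.0, effect=2.0, seed=1):
    """Return (naive pre/post gain, gain adjusted by the control group)."""
    rng = random.Random(seed)
    gains = {"control": [], "treatment": []}
    for _ in range(n):
        group = rng.choice(["control", "treatment"])     # the coin flip
        pre = rng.gauss(50, 10)
        post = pre + maturation + rng.gauss(0, 2)        # everyone "grows up"
        if group == "treatment":
            post += effect                               # only the program adds this
        gains[group].append(post - pre)
    naive = mean(gains["treatment"])                     # maturation + program
    adjusted = naive - mean(gains["control"])            # program only
    return naive, adjusted

naive, adjusted = simulate_trial()
print(f"treatment group gain alone: {naive:.2f}")
print(f"gain minus control gain:    {adjusted:.2f}")
```

The same subtraction removes anything else that hits both groups equally, such as testing effects or a school-wide event.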
Deadly Sin #6 Selected Groups Instead of Randomly Forming groups … Participants get to choose which group to join Groups formed by a criterion FAST - teachers identify children most likely to benefit Regression
Selected Groups Look Out For: Participants Choosing; Participants Being Selected Solution: Random Assignment to control and treatment groups
Deadly Sin #7 Bad Statistics Conducting Multiple Statistical Tests Conducting Statistical Tests on Small Samples
Conducting Multiple Statistical Tests … or begging for a Type I Statistical Error Look Out For Conducting several t-tests or chi-squared tests Solution Find a statistician
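A rough sketch of why several tests invite a Type I error. It assumes the tests are independent and each uses the conventional .05 cutoff; neither assumption comes from the slides. The Bonferroni adjustment shown is one common correction, mentioned here only as something a statistician might suggest.

```python
def familywise_error_rate(m, alpha=0.05):
    """Chance of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m

def bonferroni_alpha(m, alpha=0.05):
    """Stricter per-test cutoff that keeps the overall rate near alpha."""
    return alpha / m

for m in (1, 5, 10, 20):
    print(f"{m:2d} tests: chance of a false positive = "
          f"{familywise_error_rate(m):.2f}, "
          f"Bonferroni cutoff = {bonferroni_alpha(m):.4f}")
```

Under these assumptions, ten t-tests carry roughly a 40% chance of at least one "significant" result when nothing real is going on.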
Finding A Statistician Local College Psychology, Sociology or Math Department Professor Class Project Senior Thesis Remember College Time-Line
Conducting Statistical Tests on Small Samples … or begging for a Type II Statistical Error Look Out For groups with fewer than 15 persons Solution Don’t do statistics Find a statistician Get more people
Find a new partner and ... Describe your program Discuss how you would use a random control/treatment group design What problems would you encounter trying to randomly assign participants to control versus treatment groups?
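The small-sample problem can be sketched by simulation. The setup is an assumption for illustration: the program truly works, with a medium-sized effect (half a standard deviation), and each simulated study runs an ordinary two-group t-test at the .05 level. The critical values are standard t-table entries for the two group sizes shown.

```python
import random
from statistics import mean, variance

# Two-sided .05 critical t values for df = 2n - 2 (n = persons per group).
T_CRIT = {15: 2.048, 100: 1.972}

def estimated_power(n, effect=0.5, trials=2000, seed=2):
    """Share of simulated studies that detect a real effect."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        control = [rng.gauss(0, 1) for _ in range(n)]
        treated = [rng.gauss(effect, 1) for _ in range(n)]
        pooled_var = (variance(control) + variance(treated)) / 2   # equal group sizes
        t = (mean(treated) - mean(control)) / (pooled_var * 2 / n) ** 0.5
        if abs(t) > T_CRIT[n]:
            hits += 1
    return hits / trials

for n in (15, 100):
    print(f"{n:3d} per group: real effect detected in "
          f"{estimated_power(n):.0%} of simulated studies")
```

With 15 per group, most of these simulated studies miss an effect that is really there, which is the Type II error the slide warns about.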
Inspirational Quote “Bad data is free. Good data costs money.” -- Bill Ashton
The Cost of Evaluation Does your funder require Effects Evaluation? Yes, then get evaluation money from funder No, then ask yourself, “do I need to do this?” Will evaluation increase your chances of getting new funding? Yes, then find funding for evaluation and accept risk
Rights of Use of This Material Some trainers are very protective of their materials – they’re afraid that they’re giving away their business. I feel that freely distributing information like this is just good advertising for a trainer or consultant. So please use my material as you see fit; with the provision that you, in print, reference me. Please use the following information – in full: William Ashton, Ph.D. The City University of New York, York College Department of Political Science and Psychology www.york.cuny.edu/~washton