3 of your friends go to Frizzie’s Beauty Parlor. 2 of them emerge BALD! You decide that you will never go to Frizzie’s.
4 of your friends buy cars at FBN (Fly By Night) Autos. 2 of them are lemons. You decide that you will never buy a car from FBN.
5 of your friends study math. 4 of them get A’s. You decide to study math.
3 of your friends ski Mount Killerdare. 2 of them break legs. You decide that you will never ski Mount Killerdare.
12 of your friends invest in Midas International. 9 of them become millionaires. You decide to invest in Midas International.
If 10 sick people are treated with an experimental drug and 8 of them are cured, doctors might reasonably decide that the drug is effective. If 800 out of 1000 patients improve, they make the same decision with a higher degree of confidence. Statistical results are more significant if the sample is large. Perhaps we didn’t give Frizzie’s a fair chance. Realistically we make decisions with whatever data we have, and none of these decisions is foolproof! Sometimes we use probability theory to decide between two products, or two treatments.
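The effect of sample size can be made concrete with a rough margin of error. The sketch below is only an illustration: it assumes the normal (Wald) approximation for a proportion, which is crude for a sample of 10, and the function name `wald_margin` and the 1.96 multiplier (an approximate 95% level) are choices made here, not part of the original discussion.

```python
import math

def wald_margin(successes, n, z=1.96):
    """Approximate half-width of a 95% interval for a proportion.

    Uses the normal (Wald) approximation, which is rough for small n.
    """
    p = successes / n
    return z * math.sqrt(p * (1 - p) / n)

small = wald_margin(8, 10)       # 8 of 10 patients cured
large = wald_margin(800, 1000)   # 800 of 1000 patients improved

print(f"8/10:     80% +/- {small:.1%}")
print(f"800/1000: 80% +/- {large:.1%}")
```

Both samples show the same 80% rate, but under this approximation the margin for the larger sample is ten times narrower, because the margin shrinks with the square root of the sample size.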
NOW PLAYING: ELF and LOVE ACTUALLY. 6 of your friends go to see Elf. 5 of your friends go to see Love Actually.
Elf: 4 out of 6 = 67% liked this movie. Love Actually: 1 out of 5 = 20% liked this movie.
This very reasonable model for decision making is not perfect. None of our models are. Simpson’s paradox can sometimes arise when the data in a study is subdivided into smaller categories. In any model, it is possible to oversimplify - thereby ignoring important information. Let’s consider several examples of Simpson’s paradox.
SIMPSON’S PARADOX
A and B represent two treatments for a terrible disease. In the following study 26 people with this disease were treated. Some chose treatment A, others chose B. The following data was collected.
1 man treated with A recovered. 4 men treated with A did not recover.
4 women treated with A recovered. 5 women treated with A did not recover.
3 men treated with B recovered. 7 men treated with B did not recover.
1 woman treated with B recovered. 1 woman treated with B did not recover.
Men: Treatment A cure rate = 1/5 = 20%; Treatment B cure rate = 3/10 = 30%.
Women: Treatment A cure rate = 4/9 = 44%; Treatment B cure rate = 1/2 = 50%.
Treatment B has a higher cure rate for men AND Treatment B has a higher cure rate for women.
5 of the 14 people treated with A recovered. Overall cure rate = 5/14 = 36%.
4 of the 12 people treated with B recovered. Overall cure rate = 4/12 = 33%.
B has a higher cure rate for men. B has a higher cure rate for women. But A has a higher cure rate overall!
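The arithmetic for this study can be checked with a short script. This is a minimal sketch: the dictionary layout and the names `study`, `cure_rate`, and `overall_rate` are illustrative choices, while the counts are copied from the data above.

```python
from fractions import Fraction

# (recovered, treated) counts copied from the study above
study = {
    "A": {"men": (1, 5), "women": (4, 9)},
    "B": {"men": (3, 10), "women": (1, 2)},
}

def cure_rate(recovered, treated):
    """Exact cure rate as a fraction."""
    return Fraction(recovered, treated)

def overall_rate(groups):
    """Pool every group for one treatment, then compute a single cure rate."""
    recovered = sum(r for r, _ in groups.values())
    treated = sum(t for _, t in groups.values())
    return Fraction(recovered, treated)

# Compare the two treatments within each group
for group in ("men", "women"):
    a = cure_rate(*study["A"][group])
    b = cure_rate(*study["B"][group])
    print(f"{group}: A = {float(a):.0%}, B = {float(b):.0%}")

# Compare the two treatments after pooling men and women
overall_a = overall_rate(study["A"])
overall_b = overall_rate(study["B"])
print(f"overall: A = {float(overall_a):.0%}, B = {float(overall_b):.0%}")
```

Exact fractions are used so that the comparison does not depend on how the percentages are rounded.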
A MOST INGENIOUS PARADOX
Plan J and Plan W are weight loss programs.
4 of the 5 women on plan J lost weight. 2 of the 9 men on plan J lost weight.
7 of the 9 women on plan W lost weight. 1 of the 5 men on plan W lost weight.
Women: Diet Plan J success rate = 4/5 = 80%; Diet Plan W success rate = 7/9 = 78%.
Men: Diet Plan J success rate = 2/9 = 22%; Diet Plan W success rate = 1/5 = 20%.
A total of 14 people used plan J. 6 of the 14 were successful. Overall success rate = 6/14 = 43%.
A total of 14 people used plan W. 8 of the 14 were successful. Overall success rate = 8/14 = 57%.
J has a higher success rate for women. J has a higher success rate for men. But W has a higher success rate overall!
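The pattern in the diet example can be tested mechanically: one option wins in every subgroup, yet loses once the subgroups are pooled. The helper below is a hypothetical sketch; the name `is_reversal` and the input layout are assumptions made for illustration.

```python
from fractions import Fraction

def pooled_rate(plan):
    """Success rate after pooling all subgroups of one plan."""
    successes = sum(s for s, _ in plan.values())
    participants = sum(n for _, n in plan.values())
    return Fraction(successes, participants)

def is_reversal(x, y):
    """Return True if x beats y in every subgroup while y beats x overall.

    x and y map a subgroup name to (successes, participants).
    """
    wins_each_group = all(Fraction(*x[g]) > Fraction(*y[g]) for g in x)
    return wins_each_group and pooled_rate(y) > pooled_rate(x)

# Counts copied from the diet plan data above
plan_j = {"women": (4, 5), "men": (2, 9)}
plan_w = {"women": (7, 9), "men": (1, 5)}

print(is_reversal(plan_j, plan_w))
```

Calling `is_reversal(plan_j, plan_w)` asks whether J wins for women and for men while W wins overall, which is the situation described above.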
MONSTERPET and PETPLUMP are two different brands of pet food. Each is guaranteed to make your animal grow larger. The following study involves 11 cats and 11 dogs.
4 cats eat Monsterpet. 3 of them double in size; 1 remains puny.
7 cats eat Petplump. 5 of them double in size; 2 remain puny.
7 dogs eat Monsterpet. 2 of them double in size; 5 remain puny.
4 dogs eat Petplump. 1 of them doubles in size; 3 remain puny.
Cats: Monsterpet success rate = 3/4 = 75%; Petplump success rate = 5/7 = 71%.
Dogs: Monsterpet success rate = 2/7 = 29%; Petplump success rate = 1/4 = 25%.
Monsterpet total success rate = 5/11 = 45%. Petplump total success rate = 6/11 = 55%.
Overall, PETPLUMP outperforms MONSTERPET: about 55% (6/11) grow with Petplump while 45% (5/11) grow with Monsterpet.
Perhaps we can unravel this paradox. Notice that with both brands of food, the cats respond much more than the dogs: seventy-some percent of cats grow with each product, while twenty-some percent of dogs grow with each product.
But a higher percentage of cats grow with Monsterpet: 75% compared to 71%.
And a higher percentage of dogs grow with Monsterpet: 29% (2/7) compared to 25%.
Because there are more cats on the Petplump side (7 of its 11 animals, compared with 4 of 11 for Monsterpet), the overall percentage for Petplump is higher.
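One way to see why the mix of animals matters: each brand's overall rate is the average of its cat rate and its dog rate, weighted by the share of cats and dogs that ate that brand. The sketch below works this out with exact fractions; the names `breakdown` and `overall` are illustrative, and the counts come from the study above.

```python
from fractions import Fraction

# (grew, fed) counts from the pet food study
monsterpet = {"cats": (3, 4), "dogs": (2, 7)}
petplump = {"cats": (5, 7), "dogs": (1, 4)}

def breakdown(brand):
    """Return {species: (share of the brand's animals, success rate)}."""
    fed_total = sum(fed for _, fed in brand.values())
    return {
        species: (Fraction(fed, fed_total), Fraction(grew, fed))
        for species, (grew, fed) in brand.items()
    }

def overall(brand):
    """Overall rate as the share-weighted average of the per-species rates."""
    return sum(share * rate for share, rate in breakdown(brand).values())

for name, brand in (("Monsterpet", monsterpet), ("Petplump", petplump)):
    for species, (share, rate) in breakdown(brand).items():
        print(f"{name} {species}: share {share}, rate {rate}")
    print(f"{name} overall: {overall(brand)}")
```

Petplump's average leans on its high cat rate (weight 7/11), while Monsterpet's average leans on its low dog rate (weight 7/11), so Petplump comes out ahead overall despite losing within each species.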
This puny creature would not deter a burglar. But feed her Monsterpet and see results. MEOW!
We are going to rework this example, turning the animals into people participating in a drug trial. Monsterpet becomes drug A. Petplump becomes drug B. The animals who responded favorably to the pet foods by growing large will become people who were cured by the drug. The small animals who did not respond become people who were not cured.
Drug A cure rate = 5/11 = 45%. Drug B cure rate = 6/11 = 55%.
Drug B appears to be more effective. Now, suppose the people who used to be cats share a genetic trait that predisposes them to respond more favorably to treatment. Suppose this trait is invisible. Drug A really works better for both groups of people - those with the trait and those without the trait. But, because more of the people with the trait are assigned to the sample being treated with B, it makes B look better.
People with the trait: Drug A cure rate = 3/4 = 75%; Drug B cure rate = 5/7 = 71%.
People without the trait: Drug A cure rate = 2/7 = 29%; Drug B cure rate = 1/4 = 25%.
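If the hidden trait could be observed, one common remedy would be to compare the two drugs on the same mix of patients. The sketch below applies each drug's per-group cure rates to a shared mix, a technique usually called direct standardization. It is not part of the original example: the choice of mix (the pooled trial population, 11 people with the trait and 11 without) and the names `standardized_rate` and `common_mix` are assumptions made here for illustration.

```python
from fractions import Fraction

# (cured, treated) per drug, split by the hypothetical hidden trait
drug_a = {"trait": (3, 4), "no_trait": (2, 7)}
drug_b = {"trait": (5, 7), "no_trait": (1, 4)}

def standardized_rate(drug, mix):
    """Cure rate the drug would show if its patients had the given mix."""
    return sum(
        mix[group] * Fraction(cured, treated)
        for group, (cured, treated) in drug.items()
    )

# Assumption: use the pooled trial population as the common mix
# (11 of the 22 people have the trait, 11 of the 22 do not).
common_mix = {"trait": Fraction(11, 22), "no_trait": Fraction(11, 22)}

adjusted_a = standardized_rate(drug_a, common_mix)
adjusted_b = standardized_rate(drug_b, common_mix)
print(f"Drug A on the common mix: {adjusted_a} = {float(adjusted_a):.0%}")
print(f"Drug B on the common mix: {adjusted_b} = {float(adjusted_b):.0%}")
```

With both drugs evaluated on the same mix, drug A comes out ahead, which matches its lead within each group. With samples this small the numbers are illustrative only.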
A MOST INGENIOUS PARADOX