SIMPSON’S PARADOX Any statistical relationship between two variables may be reversed by including additional factors in the analysis. Application: The.


1 SIMPSON’S PARADOX
Any statistical relationship between two variables may be reversed by including additional factors in the analysis.
Application: The adjustment problem. Which factors should be included in the analysis?
(Pearson et al. 1899; Yule 1903; Simpson 1951)

2 EXAMPLES OF SIMPSON’S REVERSAL
e.g., UC Berkeley's alleged sex bias in graduate admission (Science, 1975). Overall data showed a higher rate of admission among male applicants but, broken down by department, the data showed a slight bias in favor of admitting female applicants.
e.g., "reverse regression" (1970-80): Should one, in salary discrimination cases, compare salaries of equally qualified men and women or, instead, compare qualifications of equally paid men and women? (The two comparisons give opposite conclusions.)
Practical dilemma: Why break down by department? How about by some other variable Z?
Find Z such that P(y | do(x)) = ∑_z P(y | x, z) P(z)
Solution: The back-door algorithm (Chapter 3).
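The adjustment formula on this slide can be evaluated directly from stratified counts. The following is a minimal sketch, not the back-door algorithm itself: it assumes that a suitable Z has already been chosen, and it borrows the treatment/recovery counts from slide 5 purely as illustrative input. The names `counts`, `adjusted`, and `unadjusted` are hypothetical helpers introduced here.

```python
from fractions import Fraction

# counts[z][x] = (number with outcome y, number without y).
# Illustrative input: the two subgroup tables from slide 5,
# with x1 = treated, x0 = not treated, y = recovered.
counts = {
    "z1": {"x1": (2, 8), "x0": (9, 21)},
    "z2": {"x1": (18, 12), "x0": (7, 3)},
}

def p_z(counts, z):
    """P(z): share of the whole population that falls in stratum z."""
    total = sum(a + b for rows in counts.values() for a, b in rows.values())
    in_z = sum(a + b for a, b in counts[z].values())
    return Fraction(in_z, total)

def p_y_given_xz(counts, x, z):
    """P(y | x, z): outcome rate among units with treatment x in stratum z."""
    with_y, without_y = counts[z][x]
    return Fraction(with_y, with_y + without_y)

def adjusted(counts, x):
    """Sum over z of P(y | x, z) P(z). This equals P(y | do(x)) only
    under the assumption that Z is a valid adjustment set."""
    return sum(p_y_given_xz(counts, x, z) * p_z(counts, z) for z in counts)

def unadjusted(counts, x):
    """P(y | x) computed from the pooled counts, ignoring z."""
    with_y = sum(counts[z][x][0] for z in counts)
    n = sum(sum(counts[z][x]) for z in counts)
    return Fraction(with_y, n)

for x in ("x1", "x0"):
    print(x, "adjusted:", adjusted(counts, x), "unadjusted:", unadjusted(counts, x))
```

With these counts the adjusted and unadjusted comparisons point in opposite directions, which is why the choice of Z matters.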

3

4 PEARSON’S SHOCK: “SPURIOUS CORRELATION”
“We are thus forced to the conclusion that a mixture of heterogeneous groups, each of which exhibits in itself no organic correlation, will exhibit a greater or less amount of correlation. This correlation may properly be called spurious, yet as it is almost impossible to guarantee the absolute homogeneity of any community, our results for correlation are always liable to an error, the amount of which cannot be foretold. To those who persist in looking upon all correlation as cause and effect, the fact that correlation can be produced between two quite uncorrelated characters A and B by taking an artificial mixture of two closely allied races, must come as rather a shock.” [Pearson, Lee & Bramley-Moore (1899)]
1. Causation = perfect correlation
2. “Not all correlations are correlations” (Aldrich 1994)

5 SIMPSON’S PARADOX (1951 – 1994)
T – Treated, T̄ – Not treated; R – Recovered, R̄ – Dead; M – Males, M̄ – Females

M̄          R    R̄   Total
T          2    8    10
T̄          9   21    30
Total     11   29    40

+

M          R    R̄   Total
T         18   12    30
T̄          7    3    10
Total     25   15    40

=

Combined   R    R̄   Total
T         20   20    40
T̄         16   24    40
Total     36   44    80

Easy question (1950-1994): When / why the reversal?
Harder questions (1994): Is the treatment useful? Which table to consult? Why is Simpson’s reversal a paradox?
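The reversal in these tables can be checked with a few lines of arithmetic. The sketch below uses only the counts shown on this slide; the names `subgroup_1`, `subgroup_2`, `combine`, and `rates` are hypothetical and introduced for illustration.

```python
from fractions import Fraction

# (recovered, dead) counts per treatment row, copied from the slide.
subgroup_1 = {"treated": (2, 8), "untreated": (9, 21)}
subgroup_2 = {"treated": (18, 12), "untreated": (7, 3)}

def combine(a, b):
    """Add two tables cell by cell to obtain the pooled table."""
    return {row: (a[row][0] + b[row][0], a[row][1] + b[row][1]) for row in a}

def rates(table):
    """Recovery rate for each treatment row of a table."""
    return {row: Fraction(rec, rec + dead) for row, (rec, dead) in table.items()}

combined = combine(subgroup_1, subgroup_2)

for name, table in [("subgroup 1", subgroup_1),
                    ("subgroup 2", subgroup_2),
                    ("combined", combined)]:
    r = rates(table)
    print(f"{name}: treated {float(r['treated']):.0%}, "
          f"untreated {float(r['untreated']):.0%}")
```

In each subgroup the treated row has the lower recovery rate, while in the combined table the treated row has the higher one.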

6 SIMPSON’S REVERSAL
Group behavior:
Pr(recovery | drug, male) > Pr(recovery | no-drug, male)
Pr(recovery | drug, female) > Pr(recovery | no-drug, female)
Overall behavior:
Pr(recovery | drug) < Pr(recovery | no-drug)
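One way to see why the group and overall inequalities can hold together is the law of total probability: each overall rate is a weighted average of the group rates, and the weights differ between the drug and no-drug arms. A short sketch, where the index g (ranging over male and female) is a symbol introduced here for illustration:

```latex
\begin{align*}
\Pr(\text{recovery} \mid \text{drug})
  &= \sum_{g} \Pr(\text{recovery} \mid \text{drug}, g)\,\Pr(g \mid \text{drug}), \\
\Pr(\text{recovery} \mid \text{no-drug})
  &= \sum_{g} \Pr(\text{recovery} \mid \text{no-drug}, g)\,\Pr(g \mid \text{no-drug}).
\end{align*}
```

Because Pr(g | drug) and Pr(g | no-drug) need not be equal, the two weighted averages can be ordered differently from their components.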

7 TO ADJUST OR NOT TO ADJUST?
[Figure: two causal diagrams over Treatment (X) and Recovery (Y). In the first, Z is Gender; in the second, Z is a mediating factor.]

8 THE INEVITABLE CONCLUSION: THE PARADOX STEMS FROM CAUSAL INTERPRETATION
TWO PROOFS:
1. Surprise surfaces only when we speak about “efficacy,” not about evidence for prediction.
2. When two causal models generate the same statistical data, and in one we decide to use the drug yet in the other not to use it, our decision must be driven by causal and not by statistical considerations. Thus, there is no statistical criterion to warn us against consulting the wrong table.
Q. Can temporal information help?
A. No! See Figure 6.3(c).

9 WHY TEMPORAL INFORMATION DOES NOT HELP
[Figure: four causal diagrams, (a) to (d), each over C (Treatment), F, and E (Recovery). In (a), F is Gender; in (b), F is Blood Pressure.]
In (c), F may occur before or after C, and the correct answer is to consult the combined table.
In (d), F may occur before or after C, and the correct answer is to consult the F-specific tables.

10 WHY SIMPSON’S PARADOX EVOKES SURPRISE
1. People think causes, not proportions.
2. "Reversal" is possible in the calculus of proportions but impossible in the calculus of causes.

11 CAUSAL CALCULUS PROHIBITS REVERSAL
Group behavior:
Pr(recovery | do{drug}, male) > Pr(recovery | do{no-drug}, male)
Pr(recovery | do{drug}, female) > Pr(recovery | do{no-drug}, female)
Assumption:
Pr(male | do{drug}) = Pr(male | do{no-drug})
Overall behavior:
Pr(recovery | do{drug}) > Pr(recovery | do{no-drug})

12 THE SURE THING PRINCIPLE
Theorem 6.1.1
An action C that increases the probability of an event E in each subpopulation must also increase the probability of E in the population as a whole, provided that the action does not change the distribution of the subpopulations.
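The theorem follows from a short calculation, sketched here under stated assumptions rather than quoted from the source. The subpopulations are indexed by F, the premise "increases the probability of E in each subpopulation" is read as P(E | do(C), F) > P(E | do(¬C), F) for every F, and the proviso is read as P(F | do(C)) = P(F | do(¬C)); this notation is an assumption made for illustration.

```latex
\begin{align*}
P(E \mid do(C)) - P(E \mid do(\neg C))
  &= \sum_{F} P(E \mid do(C), F)\,P(F \mid do(C))
   - \sum_{F} P(E \mid do(\neg C), F)\,P(F \mid do(\neg C)) \\
  &= \sum_{F} \bigl[\, P(E \mid do(C), F) - P(E \mid do(\neg C), F) \,\bigr]\,P(F \mid do(C)) \\
  &> 0 .
\end{align*}
```

The second line uses the proviso to merge the two sums under common weights. The last line holds because every bracketed term is positive and the weights are nonnegative and sum to one.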

