1
Don't Compare Averages
Holger Bast, Max-Planck-Institut für Informatik (MPII), Saarbrücken, Germany
Joint work with Ingmar Weber
WEA 2005, May 10 – May 13, Santorini Island, Greece
2
Two famous quotes
"There are three kinds of lies: lies, damn lies, and statistics." Benjamin Disraeli, 1804 – 1881 (reported by Mark Twain)
"Never believe any statistics you haven't forged yourself." Winston Churchill, 1874 – 1965
3
A typical figure
[Figure: two curves, "Theirs" and "Ours". X-axis: input size. Y-axis: some cost measure.]
Each point represents an average over a number of iterations.
4
Changing the cost measure...
… by a monotone function, say from c to 2^c
This is from authentic data!
[Figure: the same comparison plotted twice, once with cost measure c and once with cost measure 2^c.]
5
No deep mathematics here
Even for strictly monotone f
– certainly E f(X) ≠ f(E X) in general
– but also E X ≤ E Y does not in general imply E f(X) ≤ E f(Y)
Example, with f(c) = 2^c
– X : 4, 4 → average 4
– Y : 1, 5 → average 3
– 2^X : 2^4, 2^4 → average 16
– 2^Y : 2^1, 2^5 → average 17
So E Y < E X, yet E 2^Y > E 2^X.
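The example on this slide can be replayed in a few lines. This is only a sketch using the slide's own numbers; the variable names are chosen here for illustration.

```python
from statistics import mean

def f(c):
    # The monotone transformation used on the slide: c -> 2^c
    return 2 ** c

X = [4, 4]  # measurements of the first method (from the slide)
Y = [1, 5]  # measurements of the second method (from the slide)

avg_x = mean(X)
avg_y = mean(Y)
avg_fx = mean([f(c) for c in X])
avg_fy = mean([f(c) for c in Y])

print("averages on the c scale:  ", avg_x, avg_y)
print("averages on the 2^c scale:", avg_fx, avg_fy)
print("order reversed:", (avg_x > avg_y) and (avg_fx < avg_fy))
```

On the c scale X has the larger average, on the 2^c scale Y does, although every single measurement was transformed by the same increasing function.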
6
Examples of multiple cost measures
Language modeling
– for a given probability distribution p_1, …, p_n
– find a distribution q_1, …, q_n from a constrained class that
  minimizes cross-entropy Σ p_i log (p_i / q_i)
  minimizes perplexity Π (p_i / q_i)^(p_i) = 2^(cross-entropy)
Algorithm A uses algorithm B as a subroutine
– B produces a result of average quality q
– complexity of A depends on, say, q^2
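The relation between the two language-modeling measures can be checked numerically. The sketch below assumes the logarithm is taken to base 2 (the slide does not state the base) and uses two made-up distributions p and q that are purely illustrative.

```python
import math

# Hypothetical distributions, chosen only for illustration.
p = [0.5, 0.25, 0.25]
q = [0.4, 0.4, 0.2]

# Cross-entropy as written on the slide: sum of p_i * log(p_i / q_i),
# with log assumed to be base 2.
cross_entropy = sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))

# Perplexity as written on the slide: product of (p_i / q_i)^(p_i).
perplexity = math.prod((pi / qi) ** pi for pi, qi in zip(p, q))

print("cross-entropy:   ", cross_entropy)
print("perplexity:      ", perplexity)
print("2^cross-entropy: ", 2 ** cross_entropy)
```

Perplexity is therefore the monotone function c → 2^c of cross-entropy, which is exactly the kind of change of cost measure under which averages can swap order.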
7
Can this also happen with error bars?
Error bars for c don't overlap, yet reversal for f(c)?
Yes, this can also happen!
[Figure: error bars plotted for c and for f(c).]
8
Can this also happen with error bars?
Complete reversal with error bars?
[Figure: error bars plotted for c and for f(c).]
10
Can this also happen with error bars?
Complete reversal with error bars?
[Figure: for c, the error bar ends E Y + δ_Y and E X – δ_X; for f(c), the error bar ends E f(Y) – δ_f(Y) and E f(X) + δ_f(X).]
δ_Z = E |Z – E Z| (absolute deviation) ≤ σ_Z = sqrt( E (Z – E Z)^2 ) (standard deviation)
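The two deviation measures on this slide can be computed directly from their definitions. The following is a minimal sketch that treats a list of measurements as the whole distribution; the helper names `abs_dev` and `std_dev` and the sample `Z` are introduced here for illustration.

```python
from statistics import mean

def abs_dev(sample):
    # delta_Z = E |Z - E Z|, the mean absolute deviation
    m = mean(sample)
    return mean([abs(z - m) for z in sample])

def std_dev(sample):
    # sigma_Z = sqrt(E (Z - E Z)^2), the (population) standard deviation
    m = mean(sample)
    return mean([(z - m) ** 2 for z in sample]) ** 0.5

Z = [1, 2, 9]  # hypothetical measurements
print("delta:", abs_dev(Z))
print("sigma:", std_dev(Z))
print("delta <= sigma:", abs_dev(Z) <= std_dev(Z))
```

Error bars of half-width δ are thus never wider than the usual error bars of half-width σ.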
11
Can this also happen with error bars?
Complete reversal with error bars?
Theorem: complete reversal can never happen! For monotone increasing f,
if E X – δ_X ≥ E Y + δ_Y
then E f(X) + δ_f(X) ≥ E f(Y) – δ_f(Y)
12
Can this also happen with error bars?
Complete reversal with error bars?
if E X – δ_X ≥ E Y + δ_Y
then E f(X) + δ_f(X) ≥ E f(Y) – δ_f(Y)
If only one of the four δ is dropped, the theorem no longer holds in general.
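A randomized sanity check of the no-complete-reversal statement might look as follows. It is a sketch, not a proof: samples are drawn uniformly from [0, 10], each sample is treated as a full distribution, f(c) = 2^c is one arbitrary increasing function, and all names are chosen here for illustration.

```python
import random

def avg(sample):
    return sum(sample) / len(sample)

def error_bar(sample):
    # Returns (E Z - delta_Z, E Z + delta_Z) with delta_Z = E |Z - E Z|.
    m = avg(sample)
    d = avg([abs(z - m) for z in sample])
    return m - d, m + d

def f(c):
    # One arbitrary increasing transformation.
    return 2.0 ** c

rng = random.Random(0)  # fixed seed so the run is reproducible
checked = 0
violations = 0
for _trial in range(20000):
    X = [rng.uniform(0, 10) for _i in range(rng.randint(1, 6))]
    Y = [rng.uniform(0, 10) for _i in range(rng.randint(1, 6))]
    x_lo, _x_hi = error_bar(X)
    _y_lo, y_hi = error_bar(Y)
    if x_lo >= y_hi:  # premise: the bar of X lies entirely above the bar of Y
        checked += 1
        _fx_lo, fx_hi = error_bar([f(c) for c in X])
        fy_lo, _fy_hi = error_bar([f(c) for c in Y])
        # Complete reversal: the bar of f(X) lies entirely below the bar of f(Y).
        if fx_hi < fy_lo - 1e-9:
            violations += 1

print("pairs with non-overlapping bars:", checked)
print("complete reversals found:", violations)
```

The small tolerance `1e-9` only guards against floating-point rounding in boundary cases.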
13
Our first proof
14
The canonical proof
1. The medians M_X and M_Y do commute with f …
   Prob( X ≤ M_X ) = ½ = Prob( f(X) ≤ f(M_X) )
   f(M_X) = M_f(X) and f(M_Y) = M_f(Y)
2. … and hence cannot reverse their order
   M_X ≤ M_Y → f(M_X) ≤ f(M_Y) because f is monotone
   → M_f(X) ≤ M_f(Y) because M and f commute
3. Expectation and median are related as
   | E X – M_X | ≤ δ_X = E | X – E X |
   | E Y – M_Y | ≤ δ_Y = E | Y – E Y |
   nothing new, but hardly any computer scientist seems to know it
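The three steps can be illustrated on the earlier example X = 4, 4 and Y = 1, 5 with f(c) = 2^c. One assumption is made in this sketch: for samples of even size the lower of the two middle values is taken as the median (`statistics.median_low`), because an interpolated median would not commute with f.

```python
from statistics import mean, median_low

def f(c):
    return 2 ** c

def abs_dev(sample):
    # delta_Z = E |Z - E Z|
    m = mean(sample)
    return mean([abs(z - m) for z in sample])

X = [4, 4]
Y = [1, 5]
fX = [f(c) for c in X]
fY = [f(c) for c in Y]

# Step 1: the median commutes with the monotone function f.
print(median_low(fX) == f(median_low(X)), median_low(fY) == f(median_low(Y)))

# Step 2: the medians keep their order under f (unlike the averages).
print(median_low(Y) <= median_low(X), median_low(fY) <= median_low(fX))

# Step 3: expectation and median differ by at most the absolute deviation.
for Z in (X, Y, fX, fY):
    print(abs(mean(Z) - median_low(Z)) <= abs_dev(Z))
```

For this data the averages swap order under f while the medians do not, which is the mechanism the proof relies on.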
15
The canonical proof
Now assume this would happen: E Y + δ_Y ≤ E X – δ_X, yet E f(Y) – δ_f(Y) > E f(X) + δ_f(X)
Then M_Y ≤ M_X, yet M_f(Y) > M_f(X)
This contradicts the fact that the medians cannot reverse.
16
Conclusion
Average comparison is a deceptive thing
– even with error bars!
There are more effects of this kind …
– e.g. non-overlapping error bars are not statistically significant for a particular order of the expectations (or medians)
– e.g. for normally distributed X, Y: Prob( X + δ_X ≤ Y – δ_Y | E X > E Y ) is up to 8%
Better always look at the complete histogram, and at least check maximum and minimum
[Figure with labels X and Y.]
17
Thank you! (Ευχαριστώ!)