Published by Maria Hood. Modified over 8 years ago.
1
© 1998, Geoff Kuenning
Common Mistakes in Graphics
  Excess information
  Multiple scales
  Using symbols in place of text
  Poor scales
  Using lines incorrectly
2
Excess Information
  Sneaky trick to meet length limits
  Rules of thumb (upper limits):
    – 6 curves on line chart
    – 10 bars on bar chart
    – 8 slices on pie chart
  Extract essence, don’t cram things in
3
Way Too Much Information
4
What’s Important About That Chart?
  Times for cp and rcp rise with number of replicas
  Most other benchmarks are near constant
  Exactly constant for rm
5
The Right Amount of Information
6
True Confessions
7
Multiple Scales
  Another way to meet length limits
  Basically, two graphs overlaid on each other
  Confuses reader (which line goes with which scale?)
  Misstates relationships
    – Implies equality of magnitude that doesn’t exist
8
Some Especially Bad Multiple Scales
9
Using Symbols in Place of Text
  Graphics should be self-explanatory
    – Remember that the graphs often draw the reader in
  So use explanatory text, not symbols
  This means no Greek letters!
    – Unless your conference is in Athens...
10
It’s All Greek To Me...
11
Explanation is Easy
12
Poor Scales
  Plotting programs love non-zero origins
    – But people are used to zero
  Fiddle with axis ranges (and logarithms) to get your message across
    – But don’t lie or cheat
  Sometimes trimming off high ends makes things clearer
    – Brings out low-end detail
13
Nonzero Origins (Chosen by Microsoft)
14
Proper Origins
15
A Poor Axis Range
16
A Logarithmic Range
17
A Truncated Range
18
Using Lines Incorrectly
  Don’t connect points unless interpolation is meaningful
  Don’t smooth lines that are based on samples
    – Exception: fitted non-linear curves
19
Incorrect Line Usage
20
Pictorial Games
  Non-zero origins and broken scales
  Double-whammy graphs
  Omitting confidence intervals
  Scaling by height, not area
  Poor histogram cell size
21
Non-Zero Origins and Broken Scales
  People expect (0,0) origins
    – Subconsciously
  So non-zero origins are a great way to lie
  More common than not in popular press
  Also very common to cheat by omitting part of scale
    – “Really, Your Honor, I included (0,0)”
22
Non-Zero Origins
23
The Three-Quarters Rule
  Highest point should be 3/4 of scale or more
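The three-quarters rule can be checked mechanically before a chart is drawn. The following is a minimal sketch, assuming the axis starts at `axis_min` (zero by default) and ends at `axis_max`; the function name and parameters are hypothetical and not taken from the slides.

```python
def satisfies_three_quarters_rule(values, axis_max, axis_min=0.0):
    """Return True if the highest data point reaches at least 3/4 of the axis span."""
    span = axis_max - axis_min
    if span <= 0:
        raise ValueError("axis_max must be greater than axis_min")
    # Fraction of the axis span occupied by the tallest point.
    fraction_used = (max(values) - axis_min) / span
    return fraction_used >= 0.75


# Illustrative (made-up) data: a peak of 9 on a 0..10 axis passes,
# a peak of 4 on the same axis wastes most of the plot area.
print(satisfies_three_quarters_rule([3, 7, 9], axis_max=10))
print(satisfies_three_quarters_rule([3, 4], axis_max=10))
```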
24
Double-Whammy Graphs
  Put two related measures on same graph
    – One is (almost) function of other
  Hits reader twice with same information
    – And thus overstates impact
25
Omitting Confidence Intervals
  Statistical data is inherently fuzzy
  But means appear precise
  Giving confidence intervals can make it clear there’s no real difference
    – So liars and fools leave them out
26
Graph Without Confidence Intervals
27
Graph With Confidence Intervals
28
Confidence Intervals
  Sample mean value is only an estimate of the true population mean
  Bounds $c_1$ and $c_2$ such that there is a high probability, $1-\alpha$, that the population mean $\mu$ is in the interval $(c_1, c_2)$:
    $\mathrm{Prob}\{c_1 < \mu < c_2\} = 1-\alpha$
  where $\alpha$ is the significance level and $100(1-\alpha)$ is the confidence level
  Overlapping confidence intervals are interpreted as “not statistically different”
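The interval bounds and the overlap test can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the normal (large-sample) approximation for the quantile, whereas small samples would normally call for a Student's t quantile, and the function names and sample data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (c1, c2) so that the population mean lies inside with
    probability of roughly `confidence` (normal approximation)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    alpha = 1.0 - confidence                      # significance level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # two-sided quantile
    half_width = z * stdev(sample) / sqrt(n)
    center = mean(sample)
    return (center - half_width, center + half_width)


def intervals_overlap(a, b):
    """True if intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Made-up run times for two systems, for illustration only.
system_a = [10.0, 12.0, 11.0, 13.0, 9.0, 11.0]
system_b = [11.0, 13.0, 12.0, 10.0, 12.0, 11.0]
ci_a = mean_confidence_interval(system_a)
ci_b = mean_confidence_interval(system_b)
print(ci_a, ci_b, intervals_overlap(ci_a, ci_b))
```

If the two intervals overlap, the difference between the means is read as not statistically different at the chosen confidence level.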
29
Graph With Confidence Intervals
30
Reporting Only One Run
  (tell-tale sign)
  Probably a fluke
  (It’s likely that with multiple trials this would go away)
31
Scaling by Height Instead of Area
  Clip art is popular with illustrators:
  [Figure: Women in the Workforce, 1960 and 1980]
32
The Trouble with Height Scaling
  Previous graph had heights of 2:1
  But people perceive areas, not heights
    – So areas should be what’s proportional to data
  Tufte defines a lie factor: size of effect in graphic divided by size of effect in data
    – Not limited to area scaling
    – But especially insidious there (quadratic effect)
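The quadratic effect can be made concrete with a small calculation. The sketch below assumes Tufte's usual convention that the size of an effect is the relative change from the first value to the second; the function names are hypothetical. A picture whose height doubles (with width scaled along) quadruples in area, so 2:1 data drawn this way shows a lie factor of 3.

```python
def effect_size(first, second):
    """Relative change from first to second (assumed definition of 'size of effect')."""
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)


def area_ratio_from_height_ratio(height_ratio):
    """Scaling a picture uniformly by its height scales its area quadratically."""
    return height_ratio ** 2


data_ratio = 2.0                                            # data doubles
drawn_area = area_ratio_from_height_ratio(data_ratio)       # perceived area grows 4x
print(lie_factor(1.0, drawn_area, 1.0, data_ratio))         # height scaling
print(lie_factor(1.0, data_ratio, 1.0, data_ratio))         # area scaling
```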
33
Scaling by Area
  Here’s the same graph with 2:1 area:
  [Figure: Women in the Workforce, 1960 and 1980]
34
Histogram Cell Size
  Picking bucket size is always a problem
  Prefer 5 or more observations per bucket
  Choice of bucket size can affect results:
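One way to apply the "5 or more observations per bucket" guideline is to count observations for a candidate bucket width and flag the sparse buckets. This is a minimal sketch with hypothetical function names and made-up data; only buckets that contain at least one observation are considered.

```python
def bucket_counts(data, bucket_width, start=0.0):
    """Count observations per bucket; bucket i covers
    [start + i*width, start + (i+1)*width)."""
    counts = {}
    for x in data:
        index = int((x - start) // bucket_width)
        counts[index] = counts.get(index, 0) + 1
    return counts


def sparse_buckets(counts, minimum=5):
    """Indices of non-empty buckets holding fewer than `minimum` observations."""
    return sorted(i for i, c in counts.items() if c < minimum)


# Made-up observations, for illustration only.
observations = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
for width in (2, 6):
    counts = bucket_counts(observations, bucket_width=width)
    print(width, counts, sparse_buckets(counts))
```

Trying several widths this way shows how the same data can satisfy or violate the guideline depending on the bucket size.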
37
Don’t Quote Data Out of Context
38
The Same Data in Context
39
Tell the Whole Truth
41
Special-Purpose Charts
  Histograms
  Scatter plots
  Gantt charts
  Kiviat graphs
42
Tukey’s Box Plot
  Shows range, median, quartiles all in one
  Variations: [Figure; labels: minimum, maximum, quartile, median]
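The values a box plot displays are the five-number summary. The sketch below computes it with the standard library; the function name is hypothetical, and the "inclusive" quartile method is an assumption, since several quartile conventions exist and plotting tools differ on which one they use.

```python
from statistics import quantiles


def five_number_summary(data):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return (min(data), q1, median, q3, max(data))


# Made-up data, for illustration only.
print(five_number_summary([1, 2, 3, 4, 5]))
```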
43
Histograms
44
Scatter Plots
  Useful in statistical analysis
  Also excellent for huge quantities of data
    – Can show patterns otherwise invisible
45
Better Scatter Plots
  Again, Tufte improves the standard
    – But it can be a pain with automated tools
  Can use modified Tukey box plot for axes
46
Gantt Charts
  Shows relative duration of Boolean conditions
  Arranged to make lines continuous
    – Each level after first follows FTTF pattern
47
Gantt Charts
  [Figure: example Gantt chart; rows labeled T, TT, TTTT and F, FF, FFFF]
48
Kiviat Graphs
  Also called “star charts” or “radar plots”
  Useful for looking at balance between HB (higher is better) and LB (lower is better) metrics
  [Figure: Kiviat graph with HB and LB axes]
49
Useful Reference Works
  Edward R. Tufte, The Visual Display of Quantitative Information, Graphics Press, Cheshire, Connecticut, 1983.
  Edward R. Tufte, Envisioning Information, Graphics Press, Cheshire, Connecticut, 1990.
  Edward R. Tufte, Visual Explanations, Graphics Press, Cheshire, Connecticut, 1997.
  Darrell Huff, How to Lie With Statistics, W.W. Norton & Co., New York, 1954.
50
Ratio Games
  Choosing a Base System
  Using Ratio Metrics
  Relative Performance Enhancement
  Ratio Games with Percentages
  Strategies for Winning a Ratio Game
  Correct Analysis of Ratios
51
Choosing a Base System
  Run workloads on two systems
  Normalize performance to chosen system
  Take average of ratios
  Presto: you control what’s best
52
Code Size Example
  Program   RISC-1   Z8002   R/R   Z/R
  F-bit        120     180   1.0   1.5
  Acker        144     302   1.0   2.1
  Towers        96     240   1.0   2.5
  Puzzle      2796    1398   1.0   0.5
  Sum         3156    2120   4.0   6.6
  Average      789     530   1.0   1.6 or .67?
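The "1.6 or .67?" question comes from two different ways of summarizing the same code sizes. The sketch below reproduces both numbers from the table; the dictionary layout and function names are hypothetical.

```python
# Code sizes from the table: program -> (RISC-1, Z8002).
code_size = {
    "F-bit": (120, 180),
    "Acker": (144, 302),
    "Towers": (96, 240),
    "Puzzle": (2796, 1398),
}


def average_of_ratios(sizes):
    """Normalize each program to RISC-1, then average the Z/R ratios."""
    ratios = [z8002 / risc for risc, z8002 in sizes.values()]
    return sum(ratios) / len(ratios)


def ratio_of_totals(sizes):
    """Divide total Z8002 size by total RISC-1 size (same as the ratio of averages)."""
    total_risc = sum(risc for risc, _ in sizes.values())
    total_z8002 = sum(z8002 for _, z8002 in sizes.values())
    return total_z8002 / total_risc


print(round(average_of_ratios(code_size), 2))  # Z8002 looks larger on average
print(round(ratio_of_totals(code_size), 2))    # Z8002 looks smaller in total
```

The large Puzzle program dominates the totals but counts as only one of four ratios in the average, which is why the two summaries point in opposite directions.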
53
Simple Example
  Program      1      2    1/2    2/1
  A           50    100    0.5    2.0
  B         1000    500    2.0    0.5
  Sum       1050    600   1.75   0.57
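The effect of the base choice can be seen by averaging the per-program ratios under each base. The sketch below uses the two programs from the table; the function name and the 0/1 indexing of the systems are assumptions made for illustration.

```python
# Measurements from the table: program -> (system 1, system 2).
measurements = {"A": (50, 100), "B": (1000, 500)}


def mean_normalized(results, base):
    """Average, over programs, of the other system's value divided by
    the base system's value. `base` is 0 for system 1, 1 for system 2."""
    other = 1 - base
    ratios = [row[other] / row[base] for row in results.values()]
    return sum(ratios) / len(ratios)


# With system 1 as the base, system 2 averages 1.25x the base;
# with system 2 as the base, system 1 also averages 1.25x the base.
print(mean_normalized(measurements, base=0))
print(mean_normalized(measurements, base=1))
```

Whichever system is chosen as the base, the other one appears 25% larger on average, so the person choosing the base decides which system looks worse.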
54
Simple Example
  Program      1      2    1/2
  A           50    100    0.5
  B         1000    100   10.0
  Sum       1050    200   5.25
  Ave        525    100   5.25
55
Using Ratio Metrics
  Pick a metric that is itself a ratio
    – power = throughput / response time
    – cost / performance
    – improvement ratio
  Handy because division is “hidden”
56
Relative Performance Enhancement
  Compare systems with incomparable bases
  Turn into ratios
  Example: compare Ficus 1 vs. 2 replicas with UFS vs. NFS (1 run on chosen day):
  “Proves” adding Ficus replica costs less than going from UFS to NFS
57
Ratio Games with Percentages
  Percentages are inherently ratios
    – But disguised
    – So great for ratio games
  Example: Passing tests
  A is worse, but looks better in total line!
58
More on Percentages
  Psychological impact
    – 1000% sounds bigger than 10-fold (or 11-fold)
    – Great when both original and final performance are lousy
      E.g., salary went from $40 to $80 per week
  Small sample sizes generate big lies
  Base should be initial, not final value
    – E.g., price can’t drop 400%
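The arithmetic behind these points is short enough to write out. This is a sketch with hypothetical function names: percent change is always computed against the initial value, and a percent increase converts to a fold change by adding one.

```python
def percent_change(initial, final):
    """Percent change with the initial value as the base."""
    if initial == 0:
        raise ValueError("initial value must be non-zero")
    return (final - initial) / initial * 100.0


def fold_change(percent_increase):
    """Convert a percent increase into a multiplicative factor."""
    return 1.0 + percent_increase / 100.0


print(percent_change(40, 80))   # salary from $40 to $80 per week: 100% increase
print(fold_change(1000))        # a 1000% increase is an 11-fold final value
print(percent_change(100, 0))   # a non-negative price can drop at most 100%
```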
59
True Confessions
  Sequential page placement normalized to random placement for static policies -- SPEC
60
True Confessions
  Power state policies with random placement normalized to all active memory -- SPEC
61
Strategies for Winning a Ratio Game
  Can you win?
  How to win
62
Can You Win the Ratio Game?
  If one system is better by all measures, a ratio game won’t work
    – But recall percent-passes example
    – And selecting the base lets you change the magnitude of the difference
  If each system wins on some measures, ratio games might be possible (but no promises)
    – May have to try all bases
63
How to Win Your Ratio Game
  For LB metrics, use your system as the base
  For HB metrics, use the other as a base
  If possible, adjust lengths of benchmarks
    – Elongate when your system performs best
    – Shorten when your system is worst
    – This gives greater weight to your strengths
64
For Discussion Next Tuesday
  Bring in one example of data presentation from your proceedings, either notoriously bad or exceptionally good.
  The bad ones are more fun.
  Or if you find something that is just really different, please show it.