
© 1998, Geoff Kuenning Experimental Lifecycle [Diagram; stages: vague idea; “groping around” experiences; hypothesis; model; initial observations; experiment; data, analysis, interpretation; results and final presentation]

© 1998, Geoff Kuenning The Art of Graphical Presentation Types of Variables Guidelines for Good Graphics Charts Common Mistakes in Graphics Pictorial Games Special-Purpose Charts

© 1998, Geoff Kuenning Types of Variables Qualitative –Ordered (e.g., modem, Ethernet, satellite) –Unordered (e.g., CS, math, literature) Quantitative –Discrete (e.g., number of terminals) –Continuous (e.g., time)

© 1998, Geoff Kuenning Charting Based on Variable Types Qualitative variables usually work best with bar charts or Kiviat graphs –If ordered, use bar charts to show order Quantitative variables work well in X-Y graphs –Use points if discrete, lines if continuous –Bar charts sometimes work well for discrete

© 1998, Geoff Kuenning Guidelines for Good Graphics Charts Principles of graphical excellence Principles of good graphics Specific hints for specific situations Aesthetics Friendliness

© 1998, Geoff Kuenning Principles of Graphical Excellence Graphical excellence is the well-designed presentation of interesting data: –Substance –Statistics –Design

© 1998, Geoff Kuenning Graphical Excellence (2) Complex ideas get communicated with: –Clarity –Precision –Efficiency

© 1998, Geoff Kuenning Graphical Excellence (3) Viewer gets: –Greatest number of ideas –In the shortest time –With the least ink –In the smallest space

© 1998, Geoff Kuenning Graphical Excellence (4) Is nearly always multivariate Requires telling truth about data

© 1998, Geoff Kuenning Principles of Good Graphics Above all else show the data Maximize the data-ink ratio Erase non-data ink Erase redundant data ink Revise and edit

© 1998, Geoff Kuenning Above All Else Show the Data

© 1998, Geoff Kuenning Maximize the Data-Ink Ratio

© 1998, Geoff Kuenning Erase Non-Data Ink [Two example charts; series: East, West, North]

© 1998, Geoff Kuenning Erase Redundant Data Ink [Two example charts; series: East, West, North]

© 1998, Geoff Kuenning Revise and Edit

© 1998, Geoff Kuenning Specific Things to Do Give information the reader needs Limit complexity and confusion Have a point Show statistics graphically Don’t always use graphics Discuss it in the text

© 1998, Geoff Kuenning Give Information the Reader Needs Show informative axes –Use axes to indicate range Label things fully and intelligently Highlight important points on the graph

© 1998, Geoff Kuenning Giving Information the Reader Needs

© 1998, Geoff Kuenning Limit Complexity and Confusion Not too many curves Single scale for all curves No “extra” curves No pointless decoration (“ducks”)

© 1998, Geoff Kuenning Limiting Complexity and Confusion

© 1998, Geoff Kuenning Confusion again

© 1998, Geoff Kuenning Have a Point Graphs should add information not otherwise available to reader Don’t plot data just because you collected it Know what you’re trying to show, and make sure the graph shows it

© 1998, Geoff Kuenning Having a Point Sales were up 15% this quarter:

© 1998, Geoff Kuenning Show Statistics Graphically Put bars in a reasonable order –Geographical –Best to worst –Even alphabetic Make bar widths reflect interval widths –Hard to do with most graphing software Show confidence intervals on the graph –Examples will be shown later

© 1998, Geoff Kuenning Don’t Always Use Graphics Tables are best for small sets of numbers –e.g., 20 or fewer Also best for certain arrangements of data –e.g., 10 graphs of 3 points each Sometimes a simple sentence will do Always ask whether the chart is the best way to present the information –And whether it brings out your message

© 1998, Geoff Kuenning Text Would Have Been Better

© 1998, Geoff Kuenning Discuss It in the Text Figures should be self-explanatory –Many people scan papers, just look at graphs –Good graphs build interest, “hook” readers But text should highlight and aid figures –Tell readers when to look at figures –Point out what figure is telling them –Expand on what figure has to say

© 1998, Geoff Kuenning Aesthetics Not everyone is an artist –But figures should be visually pleasing Elegance is found in –Simplicity of design –Complexity of data

© 1998, Geoff Kuenning Principles of Aesthetics Use appropriate format and design Use words, numbers, drawings together Reflect balance, proportion, relevant scale Keep detail and complexity accessible Have a story about the data (narrative quality) Do a professional job of drawing Avoid decoration and chartjunk

© 1998, Geoff Kuenning Use Appropriate Format and Design Don’t automatically draw a graph –We’ve covered this before Choose graphical format carefully Sometimes a “text graphic” works best –Use text placement to communicate numbers –Very close to being a table

© 1998, Geoff Kuenning Using Text as a Graphic [Text graphic: forecasts from CEA, DR, NABE, WEF, CBO, CB, IBM, and CE listed in ranked columns under GNP (+3.8), IPG (+5.8), CPI (+7.7), Profits, and Unempl (6.0)] “About a year ago, eight forecasters were asked for their predictions on some key economic indicators. Here’s how the forecasts stack up against the probable 1978 results (shown in the black panel).” (New York Times, Jan. 2, 1979)

© 1998, Geoff Kuenning The Stem-and-Leaf Plot From Tukey, via Tufte, heights of volcanoes in feet: [Stem-and-leaf display]
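
A stem-and-leaf display can be generated mechanically. The sketch below is only an illustration of the idea, not Tukey's or Tufte's exact layout: the function name stem_and_leaf, the leaf_unit parameter, and the sample heights are all hypothetical, and it assumes non-negative integer values.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Return the lines of a stem-and-leaf display for non-negative integers.

    Each value is divided by leaf_unit (dropping the remainder); the last
    digit of the result is the leaf and the remaining digits are the stem.
    """
    leaves = defaultdict(list)
    for value in sorted(values):
        scaled = value // leaf_unit
        leaves[scaled // 10].append(scaled % 10)

    low, high = min(leaves), max(leaves)
    width = len(str(high))
    lines = []
    # Print empty stems too, so gaps in the distribution stay visible.
    for stem in range(low, high + 1):
        digits = " ".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem:>{width}} | {digits}".rstrip())
    return lines


# Hypothetical heights in feet, displayed in units of 100 feet.
heights = [700, 1200, 1500, 2100, 2900, 3000]
print("\n".join(stem_and_leaf(heights, leaf_unit=100)))
```

Because every leaf is one character wide, the length of each row doubles as a histogram bar while the individual values stay readable.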

© 1998, Geoff Kuenning Choosing a Graphical Format Many options, more being invented all the time –Examples will be given later –See Jain for some commonly useful ones –Tufte shows ways to get creative Choose a format that reflects your data –Or that helps you analyze it yourself

© 1998, Geoff Kuenning Use Words, Numbers, Drawings Together Put graphics near or in text that discusses them –Even if you have to murder your word processor Integrate text into graphics Tufte: “Data graphics are paragraphs about data and should be treated as such”

© 1998, Geoff Kuenning Reflect Balance, Proportion, Relevant Scale Much of this boils down to “artistic sense” Make sure things are big enough to read –Tiny type is OK only for young people! Keep lines thin –But use heavier lines to indicate important information Keep horizontal larger than vertical –About 50% larger works well

© 1998, Geoff Kuenning Poor Balance and Proportion Sales in the North and West districts were steady through all quarters East sales varied widely, significantly outperforming the other districts in the third quarter

© 1998, Geoff Kuenning Better Proportion Sales in the North and West districts were steady through all quarters East sales varied widely, significantly outperforming the other districts in the third quarter

© 1998, Geoff Kuenning Keep Detail and Complexity Accessible Make your graphics friendly: –Avoid abbreviations and encodings –Run words left-to-right –Explain data with little messages –Label graphic, don’t use elaborate shadings and a complex legend –Avoid red/green distinctions –Use clean, serif fonts in mixed case

© 1998, Geoff Kuenning An Unfriendly Graph

© 1998, Geoff Kuenning A Friendly Version

© 1998, Geoff Kuenning Even Friendlier

© 1998, Geoff Kuenning Have a Story About the Data (Narrative Quality) May be difficult in technical papers But think about why you are drawing graph Example: –Performance is controlled by network speed –But it tops out at the high end –And that’s because we hit a CPU bottleneck

© 1998, Geoff Kuenning Showing a Story About the Data

© 1998, Geoff Kuenning Do a Professional Job of Drawing This is easy with modern tools –But take the time to do it right Align things carefully Check the final version in the format you will use –I.e., print the PostScript one last time before submission –Or look at your slides on the projection screen

© 1998, Geoff Kuenning Avoid Decoration and Chartjunk PowerPoint, etc. make chartjunk easy Avoid clip art, automatic backgrounds, etc. Remember: the data is the story –Statistics aren’t boring –Uninterested readers aren’t drawn by cartoons –Interested readers are distracted Does removing it change the message? –If not, leave it out

© 1998, Geoff Kuenning Examples of Chartjunk [Annotated chart; callouts: gridlines; vibration; pointless fake 3-D effects; filled “floor”; clip art; “In or out?”; filled “walls”; borders and fills galore; unintentional heavy or double lines; filled labels]

© 1998, Geoff Kuenning Common Mistakes in Graphics Excess information Multiple scales Using symbols in place of text Poor scales Using lines incorrectly

© 1998, Geoff Kuenning Excess Information Sneaky trick to meet length limits Rules of thumb (upper limits): –6 curves on line chart –10 bars on bar chart –8 slices on pie chart Extract essence, don’t cram things in
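
The rules of thumb above are simple enough to check automatically before a chart goes into a paper. This is a minimal sketch; the names MAX_ITEMS and too_crowded and the chart-type keys are hypothetical and do not come from any plotting library.

```python
# Upper limits taken from the rules of thumb above; the keys are
# illustrative labels, not identifiers from a real plotting package.
MAX_ITEMS = {"line": 6, "bar": 10, "pie": 8}


def too_crowded(chart_type, item_count):
    """Return True if the chart shows more items than the rule of thumb allows."""
    return item_count > MAX_ITEMS[chart_type]


print(too_crowded("line", 9))  # nine curves on one line chart
print(too_crowded("pie", 5))   # five slices on a pie chart
```

A failed check is a prompt to extract the essence of the data, not a reason to split one crowded chart into several equally crowded ones.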

© 1998, Geoff Kuenning Way Too Much Information

© 1998, Geoff Kuenning What’s Important About That Chart? Times for cp and rcp rise with number of replicas Most other benchmarks are near constant Exactly constant for rm

© 1998, Geoff Kuenning The Right Amount of Information

True Confessions

© 1998, Geoff Kuenning Multiple Scales Another way to meet length limits Basically, two graphs overlaid on each other Confuses reader (which line goes with which scale?) Misstates relationships –Implies equality of magnitude that doesn’t exist

© 1998, Geoff Kuenning Some Especially Bad Multiple Scales

© 1998, Geoff Kuenning Using Symbols in Place of Text Graphics should be self-explanatory –Remember that the graphs often draw the reader in So use explanatory text, not symbols This means no Greek letters! –Unless your conference is in Athens...

© 1998, Geoff Kuenning It’s All Greek To Me...

© 1998, Geoff Kuenning Explanation is Easy

© 1998, Geoff Kuenning Poor Scales Plotting programs love non-zero origins –But people are used to zero Fiddle with axis ranges (and logarithms) to get your message across –But don’t lie or cheat Sometimes trimming off high ends makes things clearer –Brings out low-end detail

© 1998, Geoff Kuenning Nonzero Origins (Chosen by Microsoft)

© 1998, Geoff Kuenning Proper Origins

© 1998, Geoff Kuenning A Poor Axis Range

© 1998, Geoff Kuenning A Logarithmic Range

© 1998, Geoff Kuenning A Truncated Range

© 1998, Geoff Kuenning Using Lines Incorrectly Don’t connect points unless interpolation is meaningful Don’t smooth lines that are based on samples –Exception: fitted non-linear curves

© 1998, Geoff Kuenning Incorrect Line Usage

© 1998, Geoff Kuenning Pictorial Games Non-zero origins and broken scales Double-whammy graphs Omitting confidence intervals Scaling by height, not area Poor histogram cell size

© 1998, Geoff Kuenning Non-Zero Origins and Broken Scales People expect (0,0) origins –Subconsciously So non-zero origins are a great way to lie More common than not in popular press Also very common to cheat by omitting part of scale –“Really, Your Honor, I included (0,0)”

© 1998, Geoff Kuenning Non-Zero Origins

© 1998, Geoff Kuenning The Three-Quarters Rule Highest point should be 3/4 of scale or more
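
The three-quarters rule can be written as a one-line check. The sketch below assumes a zero-based axis and positive data; both function names are hypothetical.

```python
def satisfies_three_quarters(highest_value, axis_max):
    """True if the highest plotted point reaches at least 3/4 of the scale."""
    return highest_value >= 0.75 * axis_max


def largest_allowed_axis_max(highest_value):
    """Largest axis maximum that still puts the highest point at 3/4 of the scale."""
    return highest_value * 4 / 3


print(satisfies_three_quarters(60, 100))
print(largest_allowed_axis_max(75))
```

Any axis maximum between the highest data value and the returned limit satisfies the rule, which leaves room to pick a round number for the top tick.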

© 1998, Geoff Kuenning Double-Whammy Graphs Put two related measures on same graph –One is (almost) function of other Hits reader twice with same information –And thus overstates impact

© 1998, Geoff Kuenning Omitting Confidence Intervals Statistical data is inherently fuzzy But means appear precise Giving confidence intervals can make it clear there’s no real difference –So liars and fools leave them out

© 1998, Geoff Kuenning Graph Without Confidence Intervals

© 1998, Geoff Kuenning Graph With Confidence Intervals

Confidence Intervals Sample mean value is only an estimate of the true population mean Bounds $c_1$ and $c_2$ such that there is a high probability, $1-\alpha$, that the population mean $\mu$ is in the interval $(c_1, c_2)$: $\mathrm{Prob}\{c_1 < \mu < c_2\} = 1-\alpha$, where $\alpha$ is the significance level and $100(1-\alpha)$ is the confidence level Overlapping confidence intervals are interpreted as “not statistically different”
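
A rough sketch of computing such an interval and applying the overlap reading described above. It assumes the normal quantile is adequate, which suits large samples; for small samples a Student t quantile would give a wider interval. The function names and the sample data are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, alpha=0.05):
    """Approximate 100(1 - alpha)% confidence interval for the population mean.

    Assumption: the normal quantile is used in place of the Student t
    quantile, so the interval is too narrow for small samples.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * stdev(sample) / len(sample) ** 0.5
    center = mean(sample)
    return center - half_width, center + half_width


def intervals_overlap(a, b):
    """True if intervals a = (c1, c2) and b = (c1, c2) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical response times (ms) measured on two systems.
system_a = [10, 12, 11, 13, 9, 11, 12, 10]
system_b = [11, 13, 12, 10, 12, 14, 11, 13]
ci_a = confidence_interval(system_a)
ci_b = confidence_interval(system_b)
print(ci_a, ci_b, intervals_overlap(ci_a, ci_b))
```

Plotting each mean with its interval as an error bar puts this comparison on the graph itself.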

Reporting Only One Run (tell-tale sign) Probably a fluke (It’s likely that with multiple trials this would go away)

© 1998, Geoff Kuenning Scaling by Height Instead of Area Clip art is popular with illustrators: Women in the Workforce Any guesses? w1980/w1960 = ?

© 1998, Geoff Kuenning The Trouble with Height Scaling Previous graph had heights of 2:1 But people perceive areas, not heights –So areas should be what’s proportional to data Tufte defines a lie factor: size of effect in graphic divided by size of effect in data –Not limited to area scaling –But especially insidious there (quadratic effect)
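
The lie factor can be worked through numerically. This sketch measures "size of effect" as relative change, which is one common reading of Tufte's definition, and assumes the picture is scaled in both height and width so that its area grows with the square of its height. The function names and the 2:1 data ratio are illustrative assumptions.

```python
def effect_size(first, second):
    """Relative change from first to second, e.g. 1 -> 2 gives 1.0 (a 100% rise)."""
    return (second - first) / first


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)


# Assumed case: the data doubles, and the picture is drawn twice as tall
# and twice as wide, so its area quadruples.
height_ratio = 2
area_ratio = height_ratio ** 2
print(lie_factor(1, area_ratio, 1, 2))
```

A lie factor of 1 means the graphic shows the effect at its true size; anything larger overstates it.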

© 1998, Geoff Kuenning Scaling by Area Here’s the same graph with 2:1 area: Women in the Workforce

© 1998, Geoff Kuenning Histogram Cell Size Picking bucket size is always a problem Prefer 5 or more observations per bucket Choice of bucket size can affect results:
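
To see how bucket size interacts with the five-observation guideline, the sketch below bins the same data at two widths and reports which buckets are too sparse. The data and the function names histogram and sparse_buckets are hypothetical; empty buckets are simply omitted.

```python
import math
from collections import Counter


def histogram(data, width, start=0.0):
    """Count observations per bucket; keys are the lower edges of non-empty buckets."""
    counts = Counter(math.floor((x - start) / width) for x in data)
    return {start + index * width: counts[index] for index in sorted(counts)}


def sparse_buckets(counts, minimum=5):
    """Lower edges of buckets holding fewer than `minimum` observations."""
    return [edge for edge, count in counts.items() if count < minimum]


# Hypothetical measurements: the same 12 values at two bucket widths.
data = list(range(12))
for width in (6, 2):
    counts = histogram(data, width)
    print(width, counts, sparse_buckets(counts))
```

With the wide buckets every cell meets the guideline; with the narrow ones none does, even though the data are identical.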

© 1998, Geoff Kuenning Don’t Quote Data Out of Context

© 1998, Geoff Kuenning The Same Data in Context

Tell the Whole Truth

© 1998, Geoff Kuenning Special-Purpose Charts Histograms Scatter plots Gantt charts Kiviat graphs

© 1998, Geoff Kuenning Tukey’s Box Plot Shows range, median, quartiles all in one: [Diagram labels: minimum, quartile, median, maximum] Variations: [diagrams]
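
The numbers a box plot draws can be computed with the Python standard library. Quartile conventions differ between textbooks and tools; this sketch assumes the "inclusive" method of statistics.quantiles, and the function name five_number_summary and the sample data are hypothetical.

```python
from statistics import quantiles


def five_number_summary(data):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)


print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

The box spans the two quartiles, the line inside it marks the median, and the whiskers reach out to the minimum and maximum.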

© 1998, Geoff Kuenning Histograms

© 1998, Geoff Kuenning Scatter Plots Useful in statistical analysis Also excellent for huge quantities of data –Can show patterns otherwise invisible

© 1998, Geoff Kuenning Better Scatter Plots Again, Tufte improves the standard –But it can be a pain with automated tools Can use modified Tukey box plot for axes

© 1998, Geoff Kuenning Gantt Charts Shows relative duration of Boolean conditions Arranged to make lines continuous –Each level after first follows FTTF pattern [Diagram: levels of segments labeled T and F]

© 1998, Geoff Kuenning Kiviat Graphs Also called “star charts” or “radar plots” Useful for looking at balance between HB (higher is better) and LB (lower is better) metrics [Diagram with axes labeled HB and LB]

© 1998, Geoff Kuenning Useful Reference Works Edward R. Tufte, The Visual Display of Quantitative Information, Graphics Press, Cheshire, Connecticut; Edward R. Tufte, Envisioning Information, Graphics Press, Cheshire, Connecticut; Edward R. Tufte, Visual Explanations, Graphics Press, Cheshire, Connecticut; Darrell Huff, How to Lie With Statistics, W.W. Norton & Co., New York, 1954

© 1998, Geoff Kuenning Ratio Games Choosing a Base System Using Ratio Metrics Relative Performance Enhancement Ratio Games with Percentages Strategies for Winning a Ratio Game Correct Analysis of Ratios

© 1998, Geoff Kuenning Choosing a Base System Run workloads on two systems Normalize performance to chosen system Take average of ratios Presto: you control what’s best

Code Size Example [Table; columns: Program, RISC-1, Z8002, R/R, Z/R; rows: F-bit, Acker, Towers, Puzzle, Sum, Average] Average or .67?

Simple Example [Table; columns: Program, 1, 2, 1/2, 2/1; rows: A, B, Sum]
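
The base-system game is easy to reproduce. The figures below are invented for illustration and are not the values from the tables above; the function name mean_ratio and the system labels are likewise hypothetical.

```python
def mean_ratio(times, system, base):
    """Average over programs of system's time divided by base's time."""
    ratios = [row[system] / row[base] for row in times.values()]
    return sum(ratios) / len(ratios)


# Hypothetical execution times for two programs on two systems.
times = {
    "A": {"system1": 10, "system2": 20},
    "B": {"system1": 20, "system2": 10},
}

totals = {name: sum(row[name] for row in times.values())
          for name in ("system1", "system2")}
print(totals)
print(mean_ratio(times, "system2", "system1"))  # system 1 chosen as the base
print(mean_ratio(times, "system1", "system2"))  # system 2 chosen as the base
```

The totals are identical, yet whichever system is normalized to the other comes out 25% slower on average, so the choice of base alone decides the "winner".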

© 1998, Geoff Kuenning Using Ratio Metrics Pick a metric that is itself a ratio –power = throughput / response time –cost / performance –improvement ratio Handy because division is “hidden”

© 1998, Geoff Kuenning Relative Performance Enhancement Compare systems with incomparable bases Turn into ratios Example: compare Ficus 1 vs. 2 replicas with UFS vs. NFS (1 run on chosen day): “Proves” adding Ficus replica costs less than going from UFS to NFS

© 1998, Geoff Kuenning Ratio Games with Percentages Percentages are inherently ratios –But disguised –So great for ratio games Example: Passing tests A is worse, but looks better in total line!
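
The passing-tests effect can be reproduced with invented counts. The numbers below are hypothetical and chosen only so that system A runs the easy test far more often than system B; the test names and helper functions are assumptions as well.

```python
# Hypothetical (passed, attempted) counts; A runs the easy test far more often.
results = {
    "A": {"easy": (80, 100), "hard": (2, 10)},
    "B": {"easy": (9, 10), "hard": (30, 100)},
}


def pass_percent(passed, attempted):
    """Percentage of attempted runs that passed."""
    return 100 * passed / attempted


def total_percent(system):
    """Pass percentage over all of a system's runs, ignoring which test they came from."""
    passed = sum(p for p, _ in results[system].values())
    attempted = sum(n for _, n in results[system].values())
    return pass_percent(passed, attempted)


for system in results:
    per_test = {test: pass_percent(*counts)
                for test, counts in results[system].items()}
    print(system, per_test, round(total_percent(system), 1))
```

A has the lower pass percentage on each individual test, but its total is higher because most of its runs come from the easy test.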

© 1998, Geoff Kuenning More on Percentages Psychological impact –1000% sounds bigger than 10-fold (or 11-fold) –Great when both original and final performance are lousy E.g., salary went from $40 to $80 per week Small sample sizes generate big lies Base should be initial, not final value –E.g., price can’t drop 400%
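
A short sketch of the arithmetic behind these points, using the initial value as the base. The function name percent_change is hypothetical, and the price figures are invented for illustration.

```python
def percent_change(initial, final):
    """Percentage change with the initial value as the base."""
    return 100 * (final - initial) / initial


print(percent_change(40, 80))    # salary going from $40 to $80 per week
print(percent_change(1, 11))     # an 11-fold increase
print(percent_change(100, 20))   # a price falling from 100 to 20

# Using the final value as the base is what produces an impossible "400% drop".
print(100 * (100 - 20) / 20)
```

With the initial value as the base, a drop can never exceed 100%, while an 11-fold increase is the same thing as a 1000% increase.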

Sequential page placement normalized to random placement for static policies -- SPEC True Confessions

Power state policies with random placement normalized to all active memory -- SPEC True Confessions

© 1998, Geoff Kuenning Strategies for Winning a Ratio Game Can you win? How to win

© 1998, Geoff Kuenning Can You Win the Ratio Game? If one system is better by all measures, a ratio game won’t work –But recall percent-passes example –And selecting the base lets you change the magnitude of the difference If each system wins on some measures, ratio games might be possible (but no promises) –May have to try all bases

© 1998, Geoff Kuenning How to Win Your Ratio Game For LB metrics, use your system as the base For HB metrics, use the other as a base If possible, adjust lengths of benchmarks –Elongate when your system performs best –Shorten when your system is worst –This gives greater weight to your strengths

For Discussion Next Tuesday Bring in one example of data presentation from your proceedings, either notoriously bad or exceptionally good. The bad ones are more fun. Or if you find something just really different, please show it.