10-1 ©2006 Raj Jain www.rajjain.com The Art of Data Presentation.

Presentation transcript:

10-2 ©2006 Raj Jain Overview
• Types of Variables
• Guidelines for Preparing Good Charts
• Common Mistakes in Preparing Charts
• Pictorial Games
• Special Charts for Computer Performance
   - Gantt Charts
   - Kiviat Graphs
   - Schumacher Charts
• Decision Maker’s Games

10-3 ©2006 Raj Jain Types of Variables
• Qualitative, ordered: type of computer (supercomputer, minicomputer, microcomputer)
• Qualitative, unordered: type of workload (scientific, engineering, educational)
• Quantitative, discrete: number of processors
• Quantitative, continuous: response time of system

10-4 ©2006 Raj Jain Guidelines for Preparing Good Charts
• Require minimum effort from the reader
   - Direct labeling vs. legend box
• Maximize information
   - Words in place of symbols
   - Clearly label the axes

10-5 ©2006 Raj Jain Guidelines (Cont)
• Minimize ink: no grid lines, more details
• Use commonly accepted practices: origin at (0,0), independent variable (cause) along the x axis, linear scales, increasing scales, equal divisions
• Avoid ambiguity: show coordinate axes, scale divisions, and origin; identify individual curves and bars
• See checklist in Box 10.1

10-6 ©2006 Raj Jain Common Mistakes in Preparing Charts
• Presenting too many alternatives on a single chart
   - Max 5 to 7 messages => max 6 curves in a line chart, no more than 10 bars in a bar chart, max 8 components in a pie chart
• Presenting many y variables on a single chart

10-7 ©2006 Raj Jain Common Mistakes in Charts (Cont)
• Using symbols in place of text
• Placing extraneous information on the chart: grid lines, granularity of the grid lines
• Selecting scale ranges improperly: automatic selection by programs may not be appropriate

10-8 ©2006 Raj Jain Common Mistakes in Charts (Cont)
• Using a line chart in place of a column chart: line => continuity
[Figure: MIPS plotted against CPU type]

10-9 ©2006 Raj Jain Pictorial Games
• Using non-zero origins to emphasize the difference
• Three-quarter-high rule => height/width > 3/4
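
To see why a non-zero origin exaggerates a difference, the sketch below compares the ratio of two drawn bar heights when the y axis starts at zero and when it starts just below the data. The measurements and the helper name `drawn_ratio` are hypothetical, chosen only for illustration.

```python
def drawn_ratio(a, b, origin=0.0):
    """Ratio of the drawn heights of bars a and b when the y axis starts at `origin`."""
    return (a - origin) / (b - origin)

# Hypothetical measurements: two systems that differ by about 2%.
mine, yours = 2610.0, 2560.0

honest = drawn_ratio(mine, yours)                    # axis starts at 0
dramatic = drawn_ratio(mine, yours, origin=2550.0)   # axis starts just below the data

print(f"origin 0:    'mine' is drawn {honest:.2f}x as tall as 'yours'")
print(f"origin 2550: 'mine' is drawn {dramatic:.2f}x as tall as 'yours'")
```

With the truncated axis the same 2% difference is drawn as a sixfold difference in height.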

10-10 ©2006 Raj Jain Pictorial Games (Cont)
• Using a double-whammy graph for dramatization
   - Using related metrics

10-11 ©2006 Raj Jain Pictorial Games (Cont)
• Plotting random quantities without showing confidence intervals
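
A minimal sketch of what showing confidence intervals adds: two sample means can look different on a chart while their intervals overlap. The samples are hypothetical, and the interval uses a normal quantile from the standard library as an approximation; for samples this small a Student-t quantile would be more appropriate.

```python
from statistics import NormalDist, mean, stdev

def confidence_interval(samples, level=0.95):
    """Approximate confidence interval for the mean (normal quantile, an assumption)."""
    n = len(samples)
    m = mean(samples)
    z = NormalDist().inv_cdf(0.5 + level / 2)      # about 1.96 for 95%
    half_width = z * stdev(samples) / n ** 0.5
    return m - half_width, m + half_width

def intervals_overlap(ci_a, ci_b):
    """True when the two intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Hypothetical response times in seconds for two systems.
mine = [1.9, 2.4, 2.1, 2.8, 1.7, 2.5]
yours = [2.3, 2.9, 2.0, 3.1, 2.2, 2.7]

ci_mine = confidence_interval(mine)
ci_yours = confidence_interval(yours)
print("mine: ", ci_mine)
print("yours:", ci_yours)
print("intervals overlap:", intervals_overlap(ci_mine, ci_yours))
```

Plotting only the two means would suggest a clear winner; the overlapping intervals show that these samples do not support that conclusion.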

10-12 ©2006 Raj Jain Pictorial Games (Cont)
• Pictograms scaled by height
[Figure: two pictograms, "Mine" with performance = 2 and "Yours" with performance = 1]
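
The arithmetic behind this game can be sketched as follows, assuming the pictogram is a two-dimensional image whose width is scaled along with its height. The function name is hypothetical.

```python
def pictogram_area_ratio(value_ratio):
    """Area ratio of two pictograms whose heights are in `value_ratio`,
    assuming the aspect ratio is preserved (width scales with height)."""
    return value_ratio ** 2

# Performance of 2 vs. 1, as in the slide's figure.
print(pictogram_area_ratio(2 / 1))  # 4.0
```

A performance ratio of 2 is therefore seen as a fourfold difference in area.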

10-13 ©2006 Raj Jain Pictorial Games (Cont)
• Using inappropriate cell size in histograms
[Figure: two histograms of frequency vs. response time, one with cells [0,2), [2,4), [4,6), [6,8), [8,10), [10,12) and one with cells [0,6), [6,12)]
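
As a rough illustration of how cell size changes the apparent shape of a distribution, the sketch below bins one set of hypothetical response times with the two cell widths shown in the figure. The data and the `histogram` helper are assumptions made for this example.

```python
def histogram(samples, cell_width, upper):
    """Count samples in half-open cells [k*w, (k+1)*w) covering [0, upper)."""
    counts = [0] * (upper // cell_width)
    for x in samples:
        counts[int(x // cell_width)] += 1
    return counts

# Hypothetical response times, all in [0, 12).
samples = [1, 1, 3, 3, 3, 5, 7, 7, 9, 9, 9, 11]

fine = histogram(samples, cell_width=2, upper=12)    # cells [0,2) ... [10,12)
coarse = histogram(samples, cell_width=6, upper=12)  # cells [0,6), [6,12)

print("width 2:", fine)    # [2, 3, 1, 2, 3, 1]
print("width 6:", coarse)  # [6, 6]
```

The narrow cells show variation across the range, while the wide cells make the same data look uniform.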

10-14 ©2006 Raj Jain Pictorial Games (Cont)
• Using broken scales in column charts
[Figure: two column charts of response time for systems A through F; only the second shows the origin 0 on the response time axis]

10-15 ©2006 Raj Jain Special Charts for Computer Performance
• Gantt charts
• Kiviat graphs
• Schumacher's charts

10-16 ©2006 Raj Jain Gantt Charts
• Show the relative duration of a number of conditions
[Figure: Gantt chart with rows CPU, IO Channel, and Network; x axis: Utilization, 0% to 100%]

10-17 ©2006 Raj Jain Example: Data for Gantt Chart

10-18 ©2006 Raj Jain Draft of the Gantt Chart

10-19 ©2006 Raj Jain Final Gantt Chart

10-20 ©2006 Raj Jain Kiviat Graphs
• Radial chart with an even number of metrics
• HB (higher is better) and LB (lower is better) metrics alternate
• Ideal shape: star
[Figure: Kiviat graph with axes CPU Busy, CPU in Supervisor State, CPU in Problem State, CPU Wait, Any Channel Busy, Channel Only Busy, CPU/Channel Overlap, CPU Only Busy]

10-21 ©2006 Raj Jain Kiviat Graph for a Balanced System
• Problem: inter-related metrics
   - CPU busy = problem state + supervisor state
   - CPU wait = 100 - CPU busy
   - Channel only = any channel - CPU/channel overlap
   - CPU only = CPU busy - CPU/channel overlap
[Figure: Kiviat graph with axes CPU Busy, CPU in Supervisor State, CPU in Problem State, CPU Wait, Any Channel Busy, Channel Only Busy, CPU/Channel Overlap, CPU Only Busy]
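
The dependence among the eight axes can be sketched in code: four measured percentages determine the other four. The sketch reads the third identity as "channel only busy = any channel busy - CPU/channel overlap" (an assumption about the slide's intent); the function name and the input values are hypothetical.

```python
def derive_kiviat_metrics(problem_state, supervisor_state, any_channel_busy, overlap):
    """Derive the dependent Kiviat metrics (all values in percent of elapsed time)."""
    cpu_busy = problem_state + supervisor_state
    return {
        "CPU busy": cpu_busy,
        "CPU wait": 100 - cpu_busy,
        "Channel only busy": any_channel_busy - overlap,
        "CPU only busy": cpu_busy - overlap,
    }

# Hypothetical measurements.
derived = derive_kiviat_metrics(problem_state=60, supervisor_state=20,
                                any_channel_busy=50, overlap=40)
for name, value in derived.items():
    print(f"{name}: {value}%")
```

Because only four of the eight values are free, the plotted shape carries less independent information than its eight axes suggest.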

10-22 ©2006 Raj Jain Shapes of Kiviat Graphs
[Figure: three Kiviat graph shapes labeled CPU Keelboat, I/O Wedge, and I/O Arrow]

10-23 ©2006 Raj Jain Merrill’s Figure of Merit (FoM)
• Performance = $\{x_1, x_2, x_3, \ldots, x_{2n}\}$
   - Odd values are HB and even values are LB
• $x_{2n+1}$ is the same as $x_1$
• Average FoM = 50%
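
The FoM formula itself was an image on the slide and does not survive in this transcript. The sketch below assumes the form FoM = sqrt((1/2n) * sum over i = 1..n of (x_{2i-1} + x_{2i+1}) * (100 - x_{2i})). This form agrees with the two facts stated on the slide (x_{2n+1} wraps around to x_1, and an all-50% system scores 50%), but it is an assumption and should be checked against the book.

```python
from math import sqrt

def figure_of_merit(x):
    """Figure of merit for Kiviat values x = [x_1, ..., x_2n], in percent.

    Odd-numbered values (x_1, x_3, ...) are HB and even-numbered values are LB.
    The formula is an assumed reconstruction, not taken from the transcript.
    """
    if not x or len(x) % 2:
        raise ValueError("need a non-empty, even number of metrics")
    n = len(x) // 2
    total = 0.0
    for i in range(1, n + 1):
        hb_before = x[2 * i - 2]           # x_{2i-1}
        lb = x[2 * i - 1]                  # x_{2i}
        hb_after = x[(2 * i) % (2 * n)]    # x_{2i+1}, wrapping so x_{2n+1} = x_1
        total += (hb_before + hb_after) * (100 - lb)
    return sqrt(total / (2 * n))

print(figure_of_merit([50] * 8))       # all metrics at 50%
print(figure_of_merit([100, 0] * 4))   # ideal star: HB = 100%, LB = 0%
```

Under this assumed formula the all-50% system evaluates to 50 and the ideal star to 100.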

10-24 ©2006 Raj Jain Example: FoM
• System A:

10-25 ©2006 Raj Jain FoM Example (Cont)
• System B:
• System B has a higher figure of merit and it is better.

10-26 ©2006 Raj Jain Figure of Merit: Known Problems
• All axes are considered equal
• Extreme values are assumed to be better
• Utility is not a linear function of FoM
• Two systems with the same FoM are not equally good
• System with slightly lower FoM may be better

10-27 ©2006 Raj Jain Kiviat Graphs For Other Systems
• Networks:
[Figure: Kiviat graph with axes Application Throughput, Packets With Error, Implicit Acks, Duplicate Packets, Link Utilization, Link Overhead]

10-28 ©2006 Raj Jain Schumacher Charts
• Performance metrics are plotted in a tabular manner
• Values are normalized with respect to long-term means and standard deviations
• Any observations that are beyond mean ± one standard deviation need to be explained
• See Figure in the book
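
The normalization and the one-standard-deviation test can be sketched as below. The metric names, long-term statistics, and observations are all hypothetical.

```python
def normalized_deviation(observation, long_term_mean, long_term_std):
    """Distance of an observation from the long-term mean, in standard deviations."""
    return (observation - long_term_mean) / long_term_std

# Hypothetical long-term (mean, standard deviation) per metric.
history = {
    "cpu_busy_pct": (62.0, 8.0),
    "io_rate": (140.0, 25.0),
    "response_time_s": (1.8, 0.3),
}
# Hypothetical observations for the current period.
today = {"cpu_busy_pct": 66.0, "io_rate": 190.0, "response_time_s": 1.4}

flagged = {}
for name, value in today.items():
    z = normalized_deviation(value, *history[name])
    if abs(z) > 1.0:                      # beyond mean ± one standard deviation
        flagged[name] = round(z, 2)

print("needs explanation:", flagged)
```

Here the I/O rate and the response time fall outside the band and would need to be explained, while CPU utilization does not.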

10-29 ©2006 Raj Jain Performance Analysis Rat Holes
• Configuration
• Workload
• Metrics
• Details

10-30 ©2006 Raj Jain Reasons for not Accepting an Analysis
• This needs more analysis.
• You need a better understanding of the workload.
• It improves performance only for long IOs/packets/jobs/files, and most of the IOs/packets/jobs/files are short.
• It improves performance only for short IOs/packets/jobs/files, but who cares for the performance of short IOs/packets/jobs/files; it's the long ones that impact the system.
• It needs too much memory/CPU/bandwidth and memory/CPU/bandwidth isn't free.
• It only saves us memory/CPU/bandwidth and memory/CPU/bandwidth is cheap.
See Box 10.2 on page 162 of the book for a complete list.

10-31 ©2006 Raj Jain Summary
1. Qualitative/quantitative, ordered/unordered, discrete/continuous variables
2. Good charts should require minimum effort from the reader and provide maximum information with minimum ink
3. Use no more than 5-6 curves, select ranges properly, and follow the three-quarter-high rule
4. Gantt charts show utilizations of various components
5. Kiviat graphs show HB and LB metrics alternately on a circular graph
6. Schumacher charts show means and standard deviations
7. Workload, metrics, configuration, and details can always be challenged, so they should be carefully selected

10-32 ©2006 Raj Jain Ratio Games

10-33 ©2006 Raj Jain Overview
• Ratio Game Examples
• Using an Appropriate Ratio Metric
• Using Relative Performance Enhancement
• Ratio Games with Percentages
• Ratio Games Guidelines
• Numerical Conditions for Ratio Games

10-34 ©2006 Raj Jain Using an Appropriate Ratio Metric
• Example:
1. Throughput: A is better
2. Response time: A is worse
3. Power (ratio): A is better
• Could be a contradictory conclusion
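
The slide's example data is not in this transcript. The sketch below uses hypothetical numbers and assumes power is defined as throughput divided by response time, to show how the three metrics can rank the same two systems differently.

```python
# Hypothetical measurements; throughput is HB, response time is LB.
systems = {
    "A": {"throughput": 10.0, "response_time": 2.0},
    "B": {"throughput": 4.0, "response_time": 1.0},
}

# Assumed definition: power = throughput / response time (HB).
for m in systems.values():
    m["power"] = m["throughput"] / m["response_time"]

a, b = systems["A"], systems["B"]
print("Throughput:    A is", "better" if a["throughput"] > b["throughput"] else "worse")
print("Response time: A is", "better" if a["response_time"] < b["response_time"] else "worse")
print("Power (ratio): A is", "better" if a["power"] > b["power"] else "worse")
```

Whoever picks the metric picks the winner, which is why the choice of ratio metric needs to be justified before the data is shown.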

10-35 ©2006 Raj Jain Using Relative Performance Enhancement
• Example: Two floating point accelerators
• Problem: Incomparable bases. Need to try both on the same machine

10-36 ©2006 Raj Jain Ratio Games with Percentages
• Example: Tests on two systems
   - System A:
   - System B:
1. System B is better on both tests
2. System A is better overall
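
The tables for this example are missing from the transcript. The sketch below uses illustrative counts (an assumption, not the slide's data) to show how per-test percentages and the overall percentage can point in opposite directions when the tests have very different sizes.

```python
def pass_rate(passed, total):
    """Percentage of runs that passed."""
    return 100.0 * passed / total

# Illustrative (passed, total) counts for two tests on each system.
results = {
    "A": [(60, 300), (2, 50)],
    "B": [(8, 32), (40, 500)],
}

per_test = {s: [pass_rate(p, t) for p, t in runs] for s, runs in results.items()}
overall = {
    s: pass_rate(sum(p for p, _ in runs), sum(t for _, t in runs))
    for s, runs in results.items()
}

print("per test:", per_test)   # B is higher on each individual test
print("overall: ", overall)    # A is higher when the tests are pooled
```

The percentages hide the base sizes: each system ran most of its runs on a different test, so the pooled totals are weighted differently.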

10-37 ©2006 Raj Jain Ratio Games Guidelines
1. If one system is better on all benchmarks, contradicting conclusions cannot be drawn by any ratio game technique

10-38 ©2006 Raj Jain Guidelines (Cont)
2. Even if one system is better than the other on all benchmarks, a better relative performance can be shown by selecting an appropriate base.
   - In the previous example, System A is 40% better than System B using raw data, 43% better using System A as a base, and 42% better using System B as a base.
3. If a system is better on some benchmarks and worse on others, contradicting conclusions can be drawn in some cases, but not in all cases.
4. If the performance metric is an LB metric, it is better to use your system as the base
5. If the performance metric is an HB metric, it is better to use your opponent as the base
6. Those benchmarks that perform better on your system should be elongated and those that perform worse should be shortened
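
A minimal sketch of guidelines 3 and 4, using hypothetical response times (an LB metric) on two benchmarks. The raw averages tie, yet averaging the normalized ratios favors whichever system is chosen as the base. The helper `mean_ratio` is an illustrative name.

```python
def mean_ratio(system, base):
    """Arithmetic mean of per-benchmark ratios system/base."""
    return sum(s / b for s, b in zip(system, base)) / len(system)

# Hypothetical response times in seconds on benchmarks 1 and 2 (lower is better).
a = [20.0, 10.0]
b = [10.0, 20.0]

print("raw means:", sum(a) / len(a), sum(b) / len(b))                      # 15.0 15.0
print("base A: A =", mean_ratio(a, a), " B =", mean_ratio(b, a))           # 1.0 vs 1.25
print("base B: A =", mean_ratio(a, b), " B =", mean_ratio(b, b))           # 1.25 vs 1.0
```

With an LB metric, the base system always scores 1.0 while the other scores 1.25 here, so using your own system as the base makes it look better, as guideline 4 states.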

10-39 ©2006 Raj Jain Numerical Conditions for Ratio Games
• A is better than B iff
   - Raw data:
   - With A as the base:

10-40 ©2006 Raj Jain Numerical Conditions (Cont)
[Figure: plane with x axis "Ratio of B/A response on benchmark i" and y axis "Ratio of B/A response on benchmark j"; boundary curves for Raw Data, Base B, and Base A; regions labeled "A is better using all 3" and "B is better using all 3"]
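
The slide's formulas are not in the transcript. As a sketch under stated assumptions, suppose the metric is response time (LB), there are two benchmarks i and j, and each system is summarized by an arithmetic mean. Then "A is better than B" takes the following forms. The symbols a_i, a_j, b_i, b_j, r_i, r_j are notation introduced here, not taken from the slides.

```latex
% Assumed notation: a_i, a_j and b_i, b_j are the response times of A and B
% on benchmarks i and j; r_i = b_i / a_i and r_j = b_j / a_j.
\begin{align*}
\text{Raw data:} \quad & a_i + a_j < b_i + b_j \\
\text{A as the base:} \quad & \tfrac{1}{2}\left(\frac{b_i}{a_i} + \frac{b_j}{a_j}\right) > 1
  \iff r_i + r_j > 2 \\
\text{B as the base:} \quad & \tfrac{1}{2}\left(\frac{a_i}{b_i} + \frac{a_j}{b_j}\right) < 1
  \iff \frac{1}{r_i} + \frac{1}{r_j} < 2
\end{align*}
```

The raw-data condition can be rewritten as a_i(r_i - 1) + a_j(r_j - 1) > 0, so it depends on the relative sizes of a_i and a_j as well as on the two ratios, which is why the three boundaries in the ratio plane do not coincide.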

10-41 ©2006 Raj Jain Summary
• Ratio games arise from use of incomparable bases
• Ratios may be part of the metric
• Relative performance enhancements
• Percentages are ratios
• For HB metrics, it is better to use opponent as the base