Visualization Analytics

Presentation on theme: "Visualization Analytics"— Presentation transcript:

1 Visualization Analytics
Principles

2 Visual Analytics motivation

3 Mean of y 7.50 to 2 decimal places
Chatterjee, Sangit; Firat, Aykut (2007). "Generating Data with Identical Statistics but Dissimilar Graphics: A Follow up to the Anscombe Dataset". American Statistician. 61 (3): 248–254. doi: / X2200
- Mean of x: 9 (exact)
- Sample variance of x: 11
- Mean of y: 7.50 (to 2 decimal places)
- Sample variance of y: 4.125 (plus/minus 0.003)
- Correlation between x and y: 0.816 (to 3 decimal places)
- Linear regression line: y = 3.00 + 0.500x (to 2 and 3 decimal places, respectively)

4 Shneiderman’s Event Quartet: #1

5 Shneiderman’s Event Quartet: #2
Website statistics show high weekday visitation with gaps on weekends

6 Shneiderman’s Event Quartet: #3
Slowdown: fewer events per unit time, e.g. earthquake aftershocks

7 Shneiderman’s Event Quartet: #4

8 Visual Analytics: Where Art Meets Science
The left-brain/right-brain distinction comes from "split-brain" research in the 1960s, such as the work that later won Roger Sperry of Caltech a Nobel Prize. The left brain is also referred to as the digital brain. It controls reading and writing, calculation, and logical thinking. The right brain is referred to as the analog brain. It controls three-dimensional sense, creativity, and artistic senses. Visual Analytics gives us an opportunity to use our whole brain.
Visual Analytics combines static graphics and automated analysis techniques with interactive visualizations to support effective understanding, reasoning, and decision making. There are two applications of Visual Analytics: 1) messaging and 2) learning. Methods and tools are different for each application.

9 Table: Cancer Incidence by Type
Data from

10 Graph: Cancer Incidence by Type
In general a graphic is better than a table. A table is better in a few exceptions:
- Convey a handful of numbers
- Report precise values for lookup
- Small cases
Graphic better for comparisons
Data from

11 Pre-attentive processing

12 Target Selection Visual Cue: Color

13 Target Selection Visual Cue: Shape

14 Target Selection Visual Cue: Conjunction

15 Boundary Detection

16 Good graphical principles
Visual Analytics Good graphical principles

17 Safety Graphics Wiki General Principles
Graphics are almost always better than tables, but not all graphics are equal.
- Content: Every graphic should stand on its own. It should tell its story without a need for detailed explanatory text or supporting documents.
- Communication: Tailor each graphic to its primary communication purpose.
- Information: Maximize the data-to-ink ratio.
- Annotation: Provide legible text and information.
- Axes: Design axes to aid interpretation of a graph.
- Styles: Make symbols and plot lines distinct and readable.
- Techniques: Use established techniques to clarify the message.
- Types of plots: Use the simplest plot that is appropriate for the information to be displayed.
- Colors: Make use of color if appropriate for the medium of communication.

18 Graph 1a: Bar Chart of Distribution of Eye Irritation
Graph 2a. Bar Chart of Distribution of Eye Irritation (graph before enhancements). The data are given in along with the code we used to create this Graph. The data are percent of subjects with eye irritation at five time points (weeks 1, 2, 4, 6 and 8) and at endpoint. There is a lot of ink here, but the main information is in the percent of subjects, with confidence intervals (or some measure of variability; it is not clear what it is). Using weeks and endpoint as categorical variables doesn't show the time differences between them.
This example illustrates two common issues:
- It is not optimal in terms of data-to-ink ratio, as only the heights of the bars are important, rather than the filled bar parts themselves. (At least this particular example has the virtue of starting the bars at zero.)
- Looking at what happens over time, it is not directly clear that the time points are not equidistant in time, and the endpoint is just another set of bars, not clearly distinguished from the 'over time' view.
From paper: 5.2. Maximize data-to-ink ratio
Our second example further illustrates the good graphing principle "maximize the data-to-ink ratio". Graph 2a shows the percentage of subjects with eye redness over time in a study for three treatment groups. The Graph is pleasing to the eye but has several possible areas for improvement. Much of the ink used in Graph 2a does not aid the reader in their interpretation of the data. In fact the important message of the plot is somewhat obscured. The main information is the percent of subjects, with confidence intervals (or some measure of variability; it is not clear what the measure of variability is). However, all the ink in the bars obscures this information. Only the height of the bars is important, rather than the filled bar parts themselves.
Another problem with this graph is that the x-axis represents the continuous variable of time as a categorical variable. The eye irritation was measured at weeks 1, 2, 4, 6, and 8, but the spacing on the x-axis makes it appear that they were measured at equal time intervals. Additionally, the "End Point" is not clearly distinguished from the data at the specific weeks.
A more minor issue is the choice of colors. The plot indicates that the three treatment groups are Placebo, Drug A and Drug B. Graph 2a uses sequential colors (light to dark), which might be a better choice for treatment groups that progress from low to high, such as placebo, low dose, and high dose. In this case, it would be better to choose colors that suggest a qualitative difference between the groups. Qualitative schemes do not imply magnitude differences between legend classes; hues are used to create the primary visual differences between classes. (We do recognize, however, that depending on what Drug A and Drug B are, a sequential scheme might be appropriate.) Graphs that use an appropriate color scheme will communicate their messages more effectively. There are many resources to help statisticians choose effective color schemes, such as ColorBrewer 2.0: Color Advice for Maps [8].
- Lots of ink doesn't help the message
- Not clear what is the measure of variability
- Endpoint just another set of bars, not distinguished from 'over time' info

19 Graph 1b: Dotplot of distribution of eye irritation
This graph also treats time in weeks as a quantitative variable rather than a categorical one. The effect is subtle, but weeks 1 and 2 are now visually closer together than weeks 2, 4, 6 and 8, and the endpoint is clearly separated from the time in weeks.
- Main message not obscured by all the ink.
- Weeks 1 and 2 visually closer than weeks 2, 4, 6 and 8.
- Endpoint clearly separated from time in weeks.
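The difference between a categorical and a quantitative time axis can be sketched as a choice of x positions. The function below is a minimal illustration, not the code used for the original graphs; `axis_positions` and the `endpoint_gap` parameter are hypothetical names introduced here.

```python
def axis_positions(weeks, endpoint_gap=2.0):
    """Return (categorical, quantitative) x positions for the visits plus endpoint.

    weeks: visit times in weeks, e.g. [1, 2, 4, 6, 8].
    endpoint_gap: assumed extra space (in weeks) that sets the endpoint apart.
    """
    # Categorical axis: every visit and the endpoint get equally spaced slots.
    categorical = list(range(len(weeks) + 1))
    # Quantitative axis: each visit sits at its actual week, and the endpoint
    # is placed beyond the last visit so it reads as a separate summary.
    quantitative = [float(w) for w in weeks] + [max(weeks) + endpoint_gap]
    return categorical, quantitative


if __name__ == "__main__":
    cat, quant = axis_positions([1, 2, 4, 6, 8])
    print("categorical: ", cat)
    print("quantitative:", quant)
```

On the quantitative axis the gap between weeks 1 and 2 is half the gap between weeks 2 and 4, which the equally spaced categorical axis hides.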

20 Pie Charts & Quantitative Information
Can you describe the data? Is there a pattern in the areas?

21 Dot Charts & Quantitative Information
Did you realize some were 50% smaller than others?

22 Take home message....

23 Cognitive scale of visual cues for quantitative variables

24 Cognitive scale of visual cues for qualitative variables

25 New Drug Application (NDA)

