Revealing Structure within Clustered Parallel Coordinates Displays Jimmy Johansson, Patric Ljung, Mikael Jern, Matthew Cooper NVIS- Norrkoping Visualization.

Slides:



Advertisements
Similar presentations
Statistical Reasoning for everyday life
Advertisements

C. D. Toliver AP Statistics
Measures of Dispersion
Chapter 13 Statistics © 2008 Pearson Addison-Wesley. All rights reserved.
Displaying & Summarizing Quantitative Data
Descriptive Statistics
Measures of Dispersion
Measures of Dispersion or Measures of Variability
MEASURES OF SPREAD – VARIABILITY- DIVERSITY- VARIATION-DISPERSION
Feature Screening Concept: A greedy feature selection method. Rank features and discard those whose ranking criterions are below the threshold. Problem:
Statistics: Use Graphs to Show Data Box Plots.
Unit 3 Section 3-4.
Chapter 2 Describing Data with Numerical Measurements
Statistics for Linguistics Students Michaelmas 2004 Week 1 Bettina Braun.
Describing distributions with numbers
LECTURE 12 Tuesday, 6 October STA291 Fall Five-Number Summary (Review) 2 Maximum, Upper Quartile, Median, Lower Quartile, Minimum Statistical Software.
Exploratory Data Analysis. Computing Science, University of Aberdeen2 Introduction Applying data mining (InfoVis as well) techniques requires gaining.
1 1 Slide © 2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole.
Census A survey to collect data on the entire population.   Data The facts and figures collected, analyzed, and summarized for presentation and.
Rules of Data Dispersion By using the mean and standard deviation, we can find the percentage of total observations that fall within the given interval.
Boxplots The boxplot is an informative way of displaying the distribution of a numerical variable.. It uses the five-figure summary: minimum, lower quartile,
Percentiles and Box – and – Whisker Plots Measures of central tendency show us the spread of data. Mean and standard deviation are useful with every day.
By: Amani Albraikan 1. 2  Synonym for variability  Often called “spread” or “scatter”  Indicator of consistency among a data set  Indicates how close.
Measures of Position. ● The standard deviation is a measure of dispersion that uses the same dimensions as the data (remember the empirical rule) ● The.
Tile-based parallel coordinates and its application in financial visualization Jamal Alsakran, Ye Zhao Kent State University, Department of Computer Science,
Lecture 16 Sec – Mon, Oct 2, 2006 Measuring Variation – The Five-Number Summary.
An Introduction to Statistics. Two Branches of Statistical Methods Descriptive statistics Techniques for describing data in abbreviated, symbolic fashion.
Categorical vs. Quantitative…
INVESTIGATION 1.
INVESTIGATION Data Colllection Data Presentation Tabulation Diagrams Graphs Descriptive Statistics Measures of Location Measures of Dispersion Measures.
Copyright © Cengage Learning. All rights reserved. 2 Descriptive Analysis and Presentation of Single-Variable Data.
Numerical Measures of Variability
EXAMPLE 3 Standardized Test Practice SOLUTION From Example 2, you know the interquartile range of the data is 0.9 inch. Find 1.5 times the interquartile.
 The mean is typically what is meant by the word “average.” The mean is perhaps the most common measure of central tendency.  The sample mean is written.
Engineering Statistics KANCHALA SUDTACHAT. Statistics  Deals with  Collection  Presentation  Analysis and use of data to make decision  Solve problems.
Chapter 3, Part B Descriptive Statistics: Numerical Measures n Measures of Distribution Shape, Relative Location, and Detecting Outliers n Exploratory.
Sullivan – Fundamentals of Statistics – 2 nd Edition – Chapter 3 Section 4 – Slide 1 of 23 Chapter 3 Section 4 Measures of Position.
Chapter 5: Boxplots  Objective: To find the five-number summaries of data and create and analyze boxplots CHS Statistics.
Copyright © 2011 Pearson Education, Inc. Describing Numerical Data Chapter 4.
Data Summary Using Descriptive Measures Sections 3.1 – 3.6, 3.8
Descriptive Statistics Tabular and Graphical Displays –Frequency Distribution - List of intervals of values for a variable, and the number of occurrences.
Notes Unit 1 Chapters 2-5 Univariate Data. Statistics is the science of data. A set of data includes information about individuals. This information is.
Using Measures of Position (rather than value) to Describe Spread? 1.
Statistics topics from both Math 1 and Math 2, both featured on the GHSGT.
LIS 570 Summarising and presenting data - Univariate analysis.
Lecture 16 Sec – Tue, Feb 12, 2008 The Five-Number Summary.
Descriptive Statistics(Summary and Variability measures)
3/13/2016 Data Mining 1 Lecture 2-1 Data Exploration: Understanding Data Phayung Meesad, Ph.D. King Mongkut’s University of Technology North Bangkok (KMUTNB)
Chapter 6: Descriptive Statistics. Learning Objectives Describe statistical measures used in descriptive statistics Compute measures of central tendency.
(Unit 6) Formulas and Definitions:. Association. A connection between data values.
Describing Data Week 1 The W’s (Where do the Numbers come from?) Who: Who was measured? By Whom: Who did the measuring What: What was measured? Where:
Slide 1 Copyright © 2004 Pearson Education, Inc.  Descriptive Statistics summarize or describe the important characteristics of a known set of population.
7 th Grade Math Vocabulary Word, Definition, Model Emery Unit 4.
Elementary Statistics
Description of Data (Summary and Variability measures)
Laugh, and the world laughs with you. Weep and you weep alone
Descriptive Statistics
Topic 5: Exploring Quantitative data
STA 291 Spring 2008 Lecture 5 Dustin Lueker.
STA 291 Spring 2008 Lecture 5 Dustin Lueker.
Chapter 3 Section 4 Measures of Position.
Displaying Distributions with Graphs
Displaying and Summarizing Quantitative Data
Measuring Variation – The Five-Number Summary
Statistics and Data (Algebraic)
Lesson – Teacher Notes Standard:
The Five-Number Summary
Advanced Algebra Unit 1 Vocabulary
Measures of Dispersion
Chapter 12 Statistics.
Presentation transcript:

Revealing Structure within Clustered Parallel Coordinates Displays Jimmy Johansson, Patric Ljung, Mikael Jern, Matthew Cooper NVIS- Norrkoping Visualization and Interaction Studio, Linkoping University, Sweden Presented by Xinyu Chang Kent State University 1

1.Introduction 2.Related work 3.High-precision textures 4.Transfer function 5.Enhancing outliers 6.Feature animation 7.Conclusions and future work 2

1.Introduction 3

4 Multivariate data Data collected on several variables for each sampling unit. Introduction Introduction

5 Parallel coordinates A example of revealing multivariate data with parallel coordinates technique Introduction

6 Introduction Cluster All relate to grouping or segmenting a collection of objects into subsets or “clusters,” such that those within each cluster are more closely related to one another than objects assigned to deferent clusters. A clustering algorithm aims at grouping data items so that items in a cluster are as similar as possible and as different from data items in the other clusters as possible. Clusters can be expressed in different ways.

7 2.Related work

8 Y.H.Fua, M.O. Ward,and E.A.Rundensteiner G. Andrienko and N. Andrienko.2004 M. No votny Related work

High-precision textures

10 High-precision textures Framebuffer limitation When dealing with very large data sets, the number of lines needed to be rendered may cause limitations in interactivity. When the intensity level displayed by the parallel coordinates more than the number framebuffer supports, this information can not be revealed using additive blending. Solution Divide the data into subset equals to the capacity of the framebuffer, after the subset get rendered, accumulate them in a higher precision buffer. subset RENDER ED FRAMEBUFF ER Higher precision buffer DATA

Intensity range and transparency This value, ρ,is used to normalize the intensity range ensuring that all intensity information is retained. ρ i Equals to the value of overlapping lines. ρ global is the largest of ρ i. 11 High-precision textures If we normalize the resulting image using ρ global,clusters with a small maximum intersecting value will be more transparent. Normalizing each cluster individually using ρ i,each cluster’s maximum intersecting value will be perceived as equally dense. Both approaches are, in a sense correct, only highlighting different aspects.

12 4.Transfer function

13 Transfer function The concept of using a transfer function(TF)in conjunction with parallel coordinates to allow for a more powerful and customized analysis process. A TF defined as enables the user to easily investigate different aspects of the data by mapping a data value, s,to an opacity value, α.Both pre-defined as well as user-defined TFs may be used.

14 Transfer function

15 Transfer function

16 5.Enhancing outliers

17 Enhancing outliers Outlier a data item that differs from the main trend in a particular cluster. In order to gain additional insight into a data set, outliers must be investigated. The enhanced outliers are displayed using a separate texture, thus they can be easily switched on or off We use a method based on the interquartile range to define if a data item is an outlier or not.

Way of define outlier: interquartile range For each cluster, i,and dimension, j,the interquartile range,,is defined as the difference between upper and lower quartiles, -,where is the 25 th percentile and is the 75th percentile. The data items that belong to cluster i, are determined to be outliers if they fall β above or below This is done for each dimension, j,independently. The value of β may be interactively changed, but one commonly used rule when searching for outliers is to use β = Larger values of β can be used to identify more extreme outliers. 18 Enhancing outliers ClusterOutlier

19 Enhancing outliers

20 6.Feature animation

21 Feature animation ‘Feature animation' as an intuitive way of visualizing statistical properties of the clusters. Both the variance and skewness of a cluster may be used as a guidance as to where to start the analysis Uniform band, a cluster displayed at the centroid, rather than presented in its true size, will not take up as much vertical space and can therefore be used to get a good overview of the data.

22 Feature animation By animating the variance the user can quickly examine each cluster to see if the cluster is tight or loose. By animating the skewness the amount of asymmetry in the distribution is measured. µ ij is the mean of the X ijk σ ij is the standard deviation V ij is speed in which the vertical texture coordinates should be translated η and ζ are global constants used to control the speed of the animation.

23 7.Conclusions and future work

24 Thank you