1 An Introduction to Multivariate Data Visualization and XmdvTool
Matthew O. Ward
Computer Science Department
Worcester Polytechnic Institute
This work was supported under NSF Grant IIS
UC Berkeley, 09/19/00

2 What is Multivariate Data?
Each data point has N variables or observations
Each observation can be:
  nominal or ordinal
  discrete or continuous
  scalar, vector, or tensor
May or may not have spatial, temporal, or other connectivity attribute

3 Characteristics of a Variable
Order: grades have an order, brand names do not.
Distance metric: for income, distance equals difference. For rankings, difference is not a distance metric.
Absolute zero: temperature has an absolute zero, bank account balances do not.
A variable can be classified by these three attributes, which together are called its Scale.
Effective visualizations attempt to match the scale of the data dimension with the graphical attribute conveying it.
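
As a hedged illustration of the three attributes above, the sketch below classifies a variable from them. The class `Scale`, its field names, and the labels nominal/ordinal/interval/ratio (the conventional names for these levels, which the slide itself does not give) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scale:
    """Hypothetical record of the three attributes named on the slide."""
    ordered: bool            # do the values have an order?
    has_distance: bool       # is the difference between values a distance?
    has_absolute_zero: bool  # is there a true zero point?

    def label(self) -> str:
        # Conventional level names; assumed here, not taken from the slide.
        if not self.ordered:
            return "nominal"
        if not self.has_distance:
            return "ordinal"
        if not self.has_absolute_zero:
            return "interval"
        return "ratio"


# Examples drawn from the slide text. Treating bank balances as having a
# distance metric is an assumption by analogy with income.
brand_names = Scale(ordered=False, has_distance=False, has_absolute_zero=False)
rankings = Scale(ordered=True, has_distance=False, has_absolute_zero=False)
bank_balance = Scale(ordered=True, has_distance=True, has_absolute_zero=False)
temperature = Scale(ordered=True, has_distance=True, has_absolute_zero=True)
```

A display could then pick a graphical attribute per label, for instance position or length for the last two levels and hue for the first; that pairing is a common guideline and not something the slide specifies.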

4 Sources of Multivariate Data
Sensors (e.g., images, gauges)
Simulations
Census or other surveys
Commerce (e.g., stock market)
Communication systems
Spreadsheets and databases

5 Issues in Visualizing Multivariate Data
How many variables?
How many records?
Types of variables?
User task (exploration, confirmation, presentation)
Data feature of interest (clusters, anomalies, trends, patterns, ...)
Background of user (domain expert, visualization specialist, decision-maker, ...)

6 Methods for Visualizing Multivariate Data
Dimensional Subsetting
Dimensional Reorganization
Dimensional Embedding
Dimensional Reduction

7 Dimensional Subsetting
Scatterplot matrix displays all pairwise plots
Selection allows linkage between views
Clusters, trends, and correlations readily discerned between pairs of dimensions
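
A minimal sketch of how the panels of a scatterplot matrix can be enumerated, assuming records are tuples of numbers. The function name `scatterplot_matrix` and the sample data are hypothetical. Only the N(N-1)/2 distinct pairs are produced; a full N x N matrix would also show each pair mirrored.

```python
from itertools import combinations


def scatterplot_matrix(records, dim_names):
    """Return {(x_name, y_name): [(x, y), ...]} for every pair of dimensions."""
    panels = {}
    for i, j in combinations(range(len(dim_names)), 2):
        # Each panel is an ordinary 2-D scatterplot of dimension i against j.
        panels[(dim_names[i], dim_names[j])] = [(r[i], r[j]) for r in records]
    return panels


# Made-up sample data with three dimensions.
records = [(1.0, 20.0, 3.0), (2.0, 10.0, 6.0), (3.0, 15.0, 9.0)]
panels = scatterplot_matrix(records, ["a", "b", "c"])
```

Because every panel lists the records in the same order, a record's position in the list identifies it in all panels, which is what makes linked selection between views straightforward.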

8 Dimensional Reorganization
Parallel Coordinates creates parallel, rather than orthogonal, dimensions
Data point corresponds to polyline across axes
Clusters, trends, and anomalies discernible as groupings or outliers, based on intercepts and slopes
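
The mapping from a data point to a polyline can be sketched as follows. This is an illustration under stated assumptions: axes are evenly spaced at integer x positions, each axis is scaled to [0, 1] by the minimum and maximum of its dimension, and the names `axis_ranges` and `polyline` are hypothetical.

```python
def axis_ranges(records):
    """Per-dimension (min, max), used to scale each axis independently."""
    dims = range(len(records[0]))
    return [(min(r[d] for r in records), max(r[d] for r in records)) for d in dims]


def polyline(record, ranges):
    """Return one (axis_index, height) vertex per dimension of the record."""
    points = []
    for axis, (value, (lo, hi)) in enumerate(zip(record, ranges)):
        # A constant dimension has no spread, so place it mid-axis.
        height = 0.5 if hi == lo else (value - lo) / (hi - lo)
        points.append((axis, height))
    return points


# Made-up sample data; the third dimension is constant.
records = [(0.0, 10.0, 5.0), (5.0, 20.0, 5.0), (10.0, 30.0, 5.0)]
ranges = axis_ranges(records)
middle = polyline(records[1], ranges)
```

Under these assumptions the heights where a polyline crosses the axes are the intercepts, and the direction of each segment between adjacent axes is the slope that the slide refers to.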

9 Dimensional Reorganization (2)
Glyphs map data dimensions to graphical attributes
Size, color, shape, and orientation are commonly used
Similarities/differences in features give insights into relations

10 Dimensional Embedding
Dimensional stacking divides data space into bins
Each N-D bin has a unique 2-D screen bin
Screen space recursively divided based on bin count for each dimension
Clusters and trends manifested as repeated patterns
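
One way to realize the N-D bin to 2-D screen bin mapping is sketched below. The assignment of dimensions to axes is an assumption for illustration: even-numbered dimensions are stacked horizontally, odd-numbered ones vertically, and earlier dimensions form the outer (coarser) divisions. The function name `stack_bin` is hypothetical.

```python
from itertools import product


def stack_bin(bin_index, bin_counts):
    """Map an N-D bin index to a unique (column, row) screen cell."""
    column = row = 0
    for dim, (b, count) in enumerate(zip(bin_index, bin_counts)):
        if not 0 <= b < count:
            raise ValueError(f"bin {b} out of range for dimension {dim}")
        if dim % 2 == 0:
            # Subdivide the current horizontal cell into `count` parts.
            column = column * count + b
        else:
            # Subdivide the current vertical cell into `count` parts.
            row = row * count + b
    return column, row


# Four dimensions with 2, 3, 4, and 5 bins: the grid is (2*4) x (3*5) cells.
bin_counts = (2, 3, 4, 5)
cell = stack_bin((1, 2, 3, 4), bin_counts)
all_cells = {stack_bin(idx, bin_counts)
             for idx in product(*(range(c) for c in bin_counts))}
```

Each step multiplies the running coordinate by the bin count of the next dimension, which is the recursive subdivision of screen space described above; the set `all_cells` checks that no two N-D bins share a screen cell.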

11 Dimensional Reduction
Map N-D locations to M-D display space while best preserving N-D relations
Approaches include MDS (multidimensional scaling), PCA (principal component analysis), and Kohonen Self-Organizing Maps
Relationships conveyed by position, links, color, shape, size, etc.
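
As a small, hedged example of one of the approaches named above, the sketch below finds only the first principal component by power iteration and projects each record onto it, so M = 1. It is a simplified stand-in for a full PCA, written with the standard library only; the function names are hypothetical, and the fixed starting vector would fail in the rare case that it is orthogonal to the principal axis.

```python
import math


def first_principal_axis(records, iterations=200):
    """Return (means, unit axis) of the direction of greatest variance."""
    n = len(records)
    dims = len(records[0])
    means = [sum(r[d] for r in records) / n for d in range(dims)]
    centered = [[r[d] - means[d] for d in range(dims)] for r in records]
    # Covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(dims)]
           for i in range(dims)]
    axis = [1.0 / math.sqrt(dims)] * dims
    for _ in range(iterations):
        # Power iteration: repeatedly apply the matrix and renormalize.
        applied = [sum(cov[i][j] * axis[j] for j in range(dims))
                   for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in applied))
        if norm == 0.0:
            break
        axis = [x / norm for x in applied]
    return means, axis


def project(record, means, axis):
    """1-D display position of a record along the principal axis."""
    return sum((v - m) * a for v, m, a in zip(record, means, axis))


# Made-up records lying exactly on the line y = 2x.
records = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
means, axis = first_principal_axis(records)
positions = [project(r, means, axis) for r in records]
```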

12 The Role of Selection
User needs to interact with display, examine interesting patterns or anomalies, validate hypotheses
Selection allows isolation of subset of data for highlighting, deleting, focused analysis
Direct (clicking on displayed items) vs. indirect (range sliders)
Screen space (2-D) vs. data space (N-D)
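
A data-space (N-D) selection of the kind driven by range sliders can be sketched as a set of per-dimension intervals. The function name `brush` and the dictionary layout are assumptions for this example, not XmdvTool's interface.

```python
def brush(records, ranges):
    """Return indices of records inside every given (low, high) interval.

    `ranges` maps a dimension index to an inclusive interval; dimensions
    that are not listed are left unconstrained.
    """
    return {
        index
        for index, record in enumerate(records)
        if all(low <= record[dim] <= high
               for dim, (low, high) in ranges.items())
    }


# Made-up sample data; constrain dimensions 1 and 2 only.
records = [(1.0, 50.0, 7.0), (2.0, 10.0, 8.0), (3.0, 55.0, 1.0)]
selected = brush(records, {1: (40.0, 60.0), 2: (5.0, 10.0)})
```

Returning record indices rather than copies lets any other view highlight, delete, or isolate the same records.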

13 Demonstration of XmdvTool

14 Problems with Large Data Sets
Most techniques are effective with small to moderate sized data sets
Large sets (> 50K records) are increasingly common
When traditional visualizations used, occlusion and clutter make interpretation difficult

15 One Potential Solution
Multiresolution displays with aggregation
Explicit clustering
  Break dimensions into bins
  Aggregate in a particular order (datacubes)
Implicit clustering
  Hierarchical clustering (proximity-based merging)
  Hierarchical partitioning (proximity-based splits)
Problem: many ways to cluster, each revealing different aspects of data
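
Proximity-based merging can be illustrated with a naive agglomerative sketch. The choices here are assumptions, not the method used in XmdvTool: clusters are compared by Euclidean distance between centroids, the closest pair is merged until one root remains, and the names `centroid` and `agglomerate` are hypothetical. The brute-force search is only practical for small inputs.

```python
import math
from itertools import combinations


def centroid(records, members):
    """Mean position of the records whose indices are in `members`."""
    dims = range(len(records[0]))
    return [sum(records[m][d] for m in members) / len(members) for d in dims]


def agglomerate(records):
    """Build a binary cluster tree by repeatedly merging the closest pair."""
    if not records:
        raise ValueError("need at least one record")
    clusters = [{"members": [i], "children": []} for i in range(len(records))]
    while len(clusters) > 1:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: math.dist(
                centroid(records, clusters[pair[0]]["members"]),
                centroid(records, clusters[pair[1]]["members"]),
            ),
        )
        merged = {
            "members": clusters[a]["members"] + clusters[b]["members"],
            "children": [clusters[a], clusters[b]],
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0]


# Made-up data forming two well-separated pairs.
records = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
root = agglomerate(records)
```

A different distance measure or linkage rule would generally produce a different tree, which is the problem noted in the last bullet above.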

16 Display Options
For each cluster, show
  Center
  Extents for each dimension
  Population
  Other descriptors (e.g., quartiles)
Color clusters such that siblings have similar color to parents
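
The per-cluster descriptors listed above can be computed in one pass over a cluster's records. This is a minimal sketch; the function name `summarize` and the dictionary keys are hypothetical, and the center is taken to be the per-dimension mean, which is an assumption.

```python
def summarize(records):
    """Return center, per-dimension extents, and population of a cluster."""
    n = len(records)
    dims = range(len(records[0]))
    return {
        "population": n,
        # Center assumed to be the mean of each dimension.
        "center": [sum(r[d] for r in records) / n for d in dims],
        # Extents are the (min, max) of each dimension.
        "extents": [(min(r[d] for r in records), max(r[d] for r in records))
                    for d in dims],
    }


# Made-up cluster of three 2-D records.
summary = summarize([(1.0, 10.0), (3.0, 30.0), (2.0, 20.0)])
```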

17 Hierarchical Parallel Coordinates
Bands show cluster extents in each dimension
Opacity conveys cluster population
Color similarity indicates proximity in hierarchy
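
A hedged sketch of how a cluster's extents and population might be turned into a band and an opacity. The linear opacity mapping (population divided by the largest cluster population), the [0, 1] axis scaling, and the name `cluster_band` are all assumptions for illustration; the slide does not say how opacity is computed.

```python
def cluster_band(extents, population, axis_ranges, max_population):
    """Return ([(axis, low, high), ...], opacity) for one cluster.

    `extents` holds the cluster's (min, max) per dimension and
    `axis_ranges` the (min, max) of the whole data set per dimension.
    """
    band = []
    for axis, ((lo, hi), (axis_lo, axis_hi)) in enumerate(
            zip(extents, axis_ranges)):
        span = axis_hi - axis_lo
        if span == 0:
            # Constant dimension: draw the band as a point mid-axis.
            band.append((axis, 0.5, 0.5))
        else:
            band.append((axis, (lo - axis_lo) / span, (hi - axis_lo) / span))
    # Assumed mapping: more populous clusters are drawn more opaque.
    opacity = population / max_population
    return band, opacity


# Made-up cluster on two axes.
band, opacity = cluster_band(
    extents=[(2.0, 4.0), (10.0, 20.0)],
    population=25,
    axis_ranges=[(0.0, 10.0), (0.0, 40.0)],
    max_population=100,
)
```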

18 Hierarchical Scatterplots
Clusters displayed as rectangles, showing extents in 2 dimensions
Color/opacity consistently used for relational and population info

19 Hierarchical Glyphs
Star glyph with bands
Analogous to parallel coordinates, with radial rather than parallel dimensions
Glyph position critical for conveying relational info

20 Hierarchical Dimensional Stacking
Clusters occupy multiple bins
Overlaps can be reduced by increasing number of bins
Cell colors can be blended, or display last cluster mapped to space

21 Hierarchical Star Fields
Dimensional reduction techniques commonly displayed with starfields
Each cluster becomes circle/sphere in field
Alternatively, can show glyph at cluster location

22 Navigating Hierarchies
Drill-down, roll-up operations for more or less detail
Need selection operation to identify subtrees for exploration, pruning
Need indications of where you are in hierarchy, and where you’ve been during exploration process

23 Structure-Based Brushing
Enhancement to screen-based and data-based methods
Specify focus, extents, and level of detail
Intuitive: wedge of tree and depth of interest
Implemented by labeling/numbering terminals and propagating ranges to parents
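
The labeling scheme in the last bullet can be sketched as follows. Terminals are numbered left to right, each parent stores the range of labels beneath it, and a brush is an interval of labels (the wedge) plus a depth. The selection rule used here, returning nodes at the requested depth whose range overlaps the wedge, is an assumption, as are the names `label_terminals` and `structure_brush` and the dictionary-based tree.

```python
def label_terminals(node, next_label=0):
    """Number the leaves in order and store each node's (first, last) range."""
    if not node["children"]:
        node["range"] = (next_label, next_label)
        return next_label + 1
    first = next_label
    for child in node["children"]:
        next_label = label_terminals(child, next_label)
    # A parent's range spans the ranges of all its children.
    node["range"] = (first, next_label - 1)
    return next_label


def structure_brush(node, extent, depth, level=0):
    """Names of nodes at `depth` (or leaves above it) overlapping `extent`."""
    low, high = node["range"]
    if high < extent[0] or low > extent[1]:
        return []  # the whole subtree lies outside the wedge
    if level == depth or not node["children"]:
        return [node["name"]]
    selected = []
    for child in node["children"]:
        selected.extend(structure_brush(child, extent, depth, level + 1))
    return selected


def leaf(name):
    return {"name": name, "children": []}


# Made-up hierarchy: root -> A (a1, a2) and B (b1, b2, b3).
tree = {"name": "root", "children": [
    {"name": "A", "children": [leaf("a1"), leaf("a2")]},
    {"name": "B", "children": [leaf("b1"), leaf("b2"), leaf("b3")]},
]}
leaf_count = label_terminals(tree)
coarse = structure_brush(tree, extent=(1, 2), depth=1)
fine = structure_brush(tree, extent=(1, 2), depth=2)
```

Because each subtree is tested with a single interval comparison, whole branches outside the wedge are skipped without visiting their terminals.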

24 Structure-Based Brush
White contour links terminal nodes
Red wedge is extents selection
Color curve is depth specification
Color bar maps location in tree to unique color
Direct and indirect manipulation of brush

25 Demonstration of Hierarchical Features in XmdvTool

26 Auxiliary Tools
Extent scaling to reduce occlusion of bands
Dimensional zooming: fill display with selected subspace (N-D distortion)
Dynamic masking to fade out selected or unselected data
Saving selected subsets
Enabling/disabling dimensions
Univariate displays (Tukey box plots, tree maps)

27 Summary
Many ways to map multivariate data to images, each with strengths and weaknesses
Linking between and within displays with brushing enhances static displays and combines their strengths
Hierarchies and aggregations allow visualization of large data sets
Intuitive navigation, filtering, and focus critical to exploration process
Each basic multivariate visualization method is readily extensible to displaying cluster information

28 Problems and Future Work
Many data characteristics not currently supported (e.g., text fields, records with missing entries, data quality)
Navigation tool for hierarchies assumes linear order of nodes
  Looking at tools for dynamic reorganization
  Exploring 2-D or higher navigation interfaces
In very large hierarchies, it is difficult to focus on a narrow subset
  Developing multiresolution interface
  Investigating distortion techniques for navigation

29 Other Future Work
Projection pursuit or view recommender
Linking structure and data brushing
User studies
Customization based on domain
Query optimization via caching and prefetching

30 For More Information
XmdvTool is available in the public domain
Both Unix and Windows support
Next release will have Oracle interface, with query optimization to support exploratory operations
Papers in Vis ’94, ’95, ’99, InfoVis ’99, IEEE TVCG Vol. 6, No. 2, 2000

31 Thanks to...
Elke Rundensteiner
Ying-Huey Fua
Daniel Stroe
Yang Jing
Suggestions from Xmdv users
NSF

