1
UC Berkeley, 09/19/00
An Introduction to Multivariate Data Visualization and XmdvTool
Matthew O. Ward
Computer Science Department
Worcester Polytechnic Institute
This work was supported under NSF Grant IIS-9732897
2
What is Multivariate Data?
- Each data point has N variables or observations
- Each observation can be:
  - nominal or ordinal
  - discrete or continuous
  - scalar, vector, or tensor
- May or may not have spatial, temporal, or other connectivity attributes
3
Characteristics of a Variable
- Order: grades have an order, brand names do not.
- Distance metric: for income, distance equals difference. For rankings, difference is not a distance metric.
- Absolute zero: temperature has an absolute zero, bank account balances do not.
- Classifying a variable by these three attributes gives its scale.
- Effective visualizations attempt to match the scale of the data dimension with the graphical attribute conveying it.
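The three attributes above can be read as a small decision procedure. The sketch below is one possible reading, not something taken from the talk: the `Variable` class and `scale_of` function are hypothetical names, and the labels "interval" and "ratio" are the conventional statistics terms, which the slides do not use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """Hypothetical description of one data dimension."""
    name: str
    has_order: bool
    has_distance: bool
    has_absolute_zero: bool


def scale_of(v: Variable) -> str:
    """Classify a variable by order, distance metric, and absolute zero.

    The returned labels are the conventional scale names (an assumption;
    the slides only name the three attributes).
    """
    if not v.has_order:
        return "nominal"
    if not v.has_distance:
        return "ordinal"
    if not v.has_absolute_zero:
        return "interval"
    return "ratio"


# Examples taken from the slide's own wording.
brand = Variable("brand name", False, False, False)
grade = Variable("grade", True, False, False)
balance = Variable("bank account balance", True, True, False)
temperature = Variable("temperature", True, True, True)
```

A visualization tool could use such a label to pick a graphical attribute, for example avoiding an ordered attribute such as size for a nominal variable.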
4
Sources of Multivariate Data
- Sensors (e.g., images, gauges)
- Simulations
- Census or other surveys
- Commerce (e.g., stock market)
- Communication systems
- Spreadsheets and databases
5
Issues in Visualizing Multivariate Data
- How many variables?
- How many records?
- Types of variables?
- User task (exploration, confirmation, presentation)
- Data feature of interest (clusters, anomalies, trends, patterns, ...)
- Background of user (domain expert, visualization specialist, decision-maker, ...)
6
Methods for Visualizing Multivariate Data
- Dimensional Subsetting
- Dimensional Reorganization
- Dimensional Embedding
- Dimensional Reduction
7
Dimensional Subsetting
- Scatterplot matrix displays all pairwise plots
- Selection allows linkage between views
- Clusters, trends, and correlations are readily discerned between pairs of dimensions
8
Dimensional Reorganization
- Parallel Coordinates creates parallel, rather than orthogonal, dimensions.
- Each data point corresponds to a polyline across the axes
- Clusters, trends, and anomalies are discernible as groupings or outliers, based on intercepts and slopes
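As an illustration of how one data point becomes a polyline, the sketch below places N vertical axes at equal spacing and maps each value to a height on its axis. It is a minimal sketch and not XmdvTool's code: the function name `polyline`, the unit-square layout, and the per-dimension `mins`/`maxs` arguments are all assumptions.

```python
def polyline(record, mins, maxs, width=1.0, height=1.0):
    """Return the (x, y) vertices of one record in parallel coordinates.

    Axis i sits at x = width * i / (N - 1); the value on that axis is
    scaled linearly from [mins[i], maxs[i]] to [0, height].
    """
    n = len(record)
    points = []
    for i, (value, lo, hi) in enumerate(zip(record, mins, maxs)):
        x = width * i / (n - 1) if n > 1 else 0.0
        # A dimension with no spread is drawn at mid-height (a layout choice).
        t = (value - lo) / (hi - lo) if hi > lo else 0.5
        points.append((x, height * t))
    return points


vertices = polyline([5, 0, 10], mins=[0, 0, 0], maxs=[10, 10, 10])
```

Records with similar values trace nearby polylines, which is why clusters appear as bundles and anomalies as stray lines.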
9
Dimensional Reorganization (2)
- Glyphs map data dimensions to graphical attributes
- Size, color, shape, and orientation are commonly used
- Similarities/differences in features give insights into relations
10
Dimensional Embedding
- Dimensional stacking divides data space into bins
- Each N-D bin has a unique 2-D screen bin
- Screen space is recursively divided based on the bin count for each dimension
- Clusters and trends are manifested as repeated patterns
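The mapping from an N-D bin to its 2-D screen bin can be sketched as a mixed-radix encoding. The version below assumes dimensions are listed from outermost to innermost and alternate between the horizontal and vertical screen axes; that ordering, and the name `stacked_cell`, are illustrative assumptions rather than details given in the talk.

```python
from itertools import product


def stacked_cell(bins, counts):
    """Map N-D bin indices to a unique (column, row) screen cell.

    bins[i] is the bin index in dimension i and counts[i] is the number
    of bins in that dimension. Even-position dimensions subdivide the
    horizontal axis, odd-position dimensions the vertical axis.
    """
    col = row = 0
    for i, (b, c) in enumerate(zip(bins, counts)):
        if not 0 <= b < c:
            raise ValueError(f"bin {b} out of range for dimension {i}")
        if i % 2 == 0:
            col = col * c + b   # descend one level horizontally
        else:
            row = row * c + b   # descend one level vertically
    return col, row


counts = (2, 3, 4, 5)                      # bins per dimension
cell = stacked_cell((1, 2, 3, 4), counts)  # one 4-D bin
all_cells = {stacked_cell(b, counts)
             for b in product(*(range(c) for c in counts))}
```

With these counts the screen grid is 8 columns by 15 rows, and every one of the 120 N-D bins lands in a different cell.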
11
Dimensional Reduction
- Map N-D locations to M-D display space while best preserving N-D relations
- Approaches include MDS, PCA, and Kohonen Self-Organizing Maps
- Relationships conveyed by position, links, color, shape, size, etc.
12
The Role of Selection
- User needs to interact with the display, examine interesting patterns or anomalies, and validate hypotheses
- Selection allows isolation of a subset of data for highlighting, deleting, or focused analysis
- Direct (clicking on displayed items) vs. indirect (range sliders)
- Screen space (2-D) vs. data space (N-D)
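A data-space selection of the indirect kind (one range slider per dimension) amounts to an N-D box test. The sketch below is a hedged illustration; `brush` and its argument layout are hypothetical and not XmdvTool's interface.

```python
def brush(records, ranges):
    """Return indices of records that fall inside an N-D box.

    ranges holds one inclusive (low, high) pair per dimension, as a set
    of range sliders would provide.
    """
    return [
        i
        for i, record in enumerate(records)
        if all(lo <= value <= hi for value, (lo, hi) in zip(record, ranges))
    ]


records = [(1.0, 20.0), (5.0, 25.0), (9.0, 90.0)]
selected = brush(records, ranges=[(0.0, 6.0), (10.0, 30.0)])
```

The returned indices can then drive highlighting in every linked view, since they identify records rather than screen positions.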
13
Demonstration of XmdvTool
14
Problems with Large Data Sets
- Most techniques are effective with small to moderate-sized data sets
- Large sets (> 50K records) are increasingly common
- When traditional visualizations are used, occlusion and clutter make interpretation difficult
15
One Potential Solution
- Multiresolution displays with aggregation
- Explicit clustering
  - Break dimensions into bins
  - Aggregate in a particular order (datacubes)
- Implicit clustering
  - Hierarchical clustering (proximity-based merging)
  - Hierarchical partitioning (proximity-based splits)
- Problem: many ways to cluster, each revealing different aspects of data
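Proximity-based merging can be sketched as a naive agglomerative loop: start with one cluster per record and repeatedly merge the two closest clusters. The code below uses centroid distance as the proximity measure, which is an assumption; the talk does not say which measure or algorithm XmdvTool uses, and `agglomerate` is a hypothetical name.

```python
import math


def centroid(points):
    """Mean position of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def agglomerate(points):
    """Build a cluster tree from a non-empty list of points.

    Leaves are point indices; each merge becomes a 2-tuple of subtrees.
    This brute-force search is only suitable for small inputs.
    """
    clusters = [(i, [p]) for i, p in enumerate(points)]  # (tree, members)
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = math.dist(centroid(clusters[a][1]), centroid(clusters[b][1]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = ((clusters[a][0], clusters[b][0]),
                  clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0][0]


tree = agglomerate([(0, 0), (0, 1), (10, 10), (10, 11)])
```

Swapping in a different proximity measure changes the resulting tree, which is the slide's point that each clustering reveals different aspects of the data.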
16
Display Options
- For each cluster, show
  - Center
  - Extents for each dimension
  - Population
  - Other descriptors (e.g., quartiles)
- Color clusters such that siblings have similar color to parents
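The per-cluster descriptors listed above are simple aggregates over the cluster's members. The sketch below computes three of them; it takes the center to be the mean, which is an assumption, and `summarize` is a hypothetical helper name.

```python
def summarize(points):
    """Return center, per-dimension extents, and population of a cluster.

    points is a non-empty list of equal-length numeric tuples.
    """
    n = len(points)
    dims = range(len(points[0]))
    return {
        "center": tuple(sum(p[d] for p in points) / n for d in dims),
        "extents": tuple(
            (min(p[d] for p in points), max(p[d] for p in points)) for d in dims
        ),
        "population": n,
    }


summary = summarize([(0, 2), (4, 6)])
```

In the hierarchical displays, the extents would set the width of a band or rectangle and the population its opacity.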
17
Hierarchical Parallel Coordinates
- Bands show cluster extents in each dimension
- Opacity conveys cluster population
- Color similarity indicates proximity in hierarchy
18
Hierarchical Scatterplots
- Clusters displayed as rectangles, showing extents in 2 dimensions
- Color/opacity consistently used for relational and population info
19
Hierarchical Glyphs
- Star glyph with bands
- Analogous to parallel coordinates, with radial rather than parallel dimensions
- Glyph position critical for conveying relational info
20
Hierarchical Dimensional Stacking
- Clusters occupy multiple bins
- Overlaps can be reduced by increasing the number of bins
- Cell colors can be blended, or the cell can display the last cluster mapped to that space
21
Hierarchical Star Fields
- Results of dimensional reduction techniques are commonly displayed as starfields
- Each cluster becomes a circle/sphere in the field
- Alternatively, can show a glyph at the cluster location
22
Navigating Hierarchies
- Drill-down, roll-up operations for more or less detail
- Need selection operation to identify subtrees for exploration, pruning
- Need indications of where you are in hierarchy, and where you’ve been during exploration process
23
Structure-Based Brushing
- Enhancement to screen-based and data-based methods
- Specify focus, extents, and level of detail
- Intuitive - wedge of tree and depth of interest
- Implemented by labeling/numbering terminals and propagating ranges to parents
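The labeling scheme in the last bullet can be sketched as a single tree walk: number the terminals left to right, then give each parent the range spanned by its children. The selection rule used here (nodes at a given depth whose range lies inside the wedge) is one plausible reading of "wedge of tree and depth of interest" and is an assumption, as are the names `label_ranges` and `structure_brush` and the nested-tuple tree encoding.

```python
def label_ranges(tree):
    """Number terminals left to right and propagate ranges to parents.

    tree is a nested tuple; anything that is not a tuple is a terminal,
    and internal nodes are assumed non-empty.
    Returns a list of (depth, low, high, subtree) for every node.
    """
    nodes = []
    counter = 0

    def visit(node, depth):
        nonlocal counter
        if not isinstance(node, tuple):          # terminal: next number
            lo = hi = counter
            counter += 1
        else:                                    # parent: span of children
            spans = [visit(child, depth + 1) for child in node]
            lo, hi = spans[0][0], spans[-1][1]
        nodes.append((depth, lo, hi, node))
        return lo, hi

    visit(tree, 0)
    return nodes


def structure_brush(nodes, lo, hi, depth):
    """Select nodes at `depth` whose terminal range lies within [lo, hi]."""
    return [n for d, a, b, n in nodes if d == depth and lo <= a and b <= hi]


tree = (("a", "b"), ("c", ("d", "e")))
nodes = label_ranges(tree)
picked = structure_brush(nodes, lo=2, hi=4, depth=2)
```

Because every subtree owns a contiguous range of terminal numbers, testing whether a node falls in the wedge is two comparisons, with no need to walk the subtree.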
24
Structure-Based Brush
- White contour links terminal nodes
- Red wedge is extents selection
- Color curve is depth specification
- Color bar maps location in tree to unique color
- Direct and indirect manipulation of brush
25
Demonstration of Hierarchical Features in XmdvTool
26
Auxiliary Tools
- Extent scaling to reduce occlusion of bands
- Dimensional zooming - fill display with selected subspace (N-D distortion)
- Dynamic masking to fade out selected or unselected data
- Saving selected subsets
- Enabling/disabling dimensions
- Univariate displays (Tukey box plots, tree maps)
27
Summary
- Many ways to map multivariate data to images, each with strengths and weaknesses
- Linking between and within displays with brushing enhances static displays and combines their strengths.
- Hierarchies and aggregations allow visualization of large data sets
- Intuitive navigation, filtering, and focus are critical to the exploration process
- Each basic multivariate visualization method is readily extensible to displaying cluster information
28
Problems and Future Work
- Many data characteristics not currently supported (e.g., text fields, records with missing entries, data quality)
- Navigation tool for hierarchies assumes linear order of nodes
  - Looking at tools for dynamic reorganization
  - Exploring 2-D or higher navigation interfaces
- In very large hierarchies it is difficult to focus on a narrow subset
  - Developing multiresolution interface
  - Investigating distortion techniques for navigation
29
Other Future Work
- Projection pursuit or view recommender
- Linking structure and data brushing
- User studies
- Customization based on domain
- Query optimization via caching and prefetching
30
For More Information
- XmdvTool is available in the public domain
- Both Unix and Windows support
- Next release will have Oracle interface, with query optimization to support exploratory operations
- http://davis.wpi.edu/~xmdv
- Papers in Vis '94, '95, '99; InfoVis '99; IEEE TVCG Vol. 6, No. 2, 2000
31
Thanks to...
- Elke Rundensteiner
- Ying-Huey Fua
- Daniel Stroe
- Yang Jing
- Suggestions from Xmdv users
- NSF