Published by Eugene Jefferson. Modified over 9 years ago.
1
NERCOMP Workshop, Dec. 2, 2008
Information Visualization: the Other Half of Data Analysis
Dr. Matthew Ward
Computer Science Department
Worcester Polytechnic Institute
2
A Data Analysis Pipeline
[Figure: pipeline diagram. Stages: Raw Data, Processed Data, Hypotheses, Models, Results. Processing steps: Cleaning, Filtering, Transforming; Statistical Analysis, Pattern Recognition, Knowledge Discovery; Validation. Points in the pipeline are labeled A, B, C, and D.]
3
Where Does Visualization Come In?
- All stages can benefit from visualization
- A: identify bad data, select subsets, help choose transforms (exploratory)
- B: help choose computational techniques, set parameters, use vision to recognize, isolate, classify patterns (exploratory)
- C: superimpose derived models on data (confirmatory)
- D: present results (presentation)
4
What do we need to know to do Information Visualization?
- Characteristics of data
  - Types, size, structure
  - Semantics, completeness, accuracy
- Characteristics of user
  - Perceptual and cognitive abilities
  - Knowledge of domain, data, tasks, tools
- Characteristics of graphical mappings
  - What are the possibilities
  - Which convey data effectively and efficiently
- Characteristics of interactions
  - Which support the tasks best
  - Which are easy to learn, use, remember
5
Issues Regarding Data
- Type may indicate which graphical mappings are appropriate
  - Nominal vs. ordinal
  - Discrete vs. continuous
  - Ordered vs. unordered
  - Univariate vs. multivariate
  - Scalar vs. vector vs. tensor
  - Static vs. dynamic
  - Values vs. relations
- Trade-offs between size and accuracy needs
- Different orders/structures can reveal different features/patterns
6
Issues Regarding Users
- What graphical attributes do we perceive accurately?
- What graphical attributes do we perceive quickly?
- Which combinations of attributes are separable?
- Coping with change blindness
- How can visuals support the development of accurate mental models of the data?
- Relative vs. absolute judgements: impact on tasks
7
Issues Regarding Mappings
- Variables include shape, size, orientation, color, texture, opacity, position, motion, ...
- Some of these have an order, others don’t
- Some use up significant screen space
- Sensitivity to occlusion
- Domain customs/expectations
8
[Image: www3.sympatico.ca/blevis/Image10.gif]
9
Issues Regarding Interactions
- Interaction is a critical component
- Many categories of techniques
  - Navigation, selection, filtering, reconfiguring, encoding, connecting, and combinations of the above
- Many “spaces” in which interactions can be applied
  - Screen/pixels, data, data structures, graphical objects, graphical attributes, visualization structures
10
Importance of Evaluation
- Easy to design bad visualizations
- Many design rules exist; many conflict, and many are routinely violated
- 5 E’s of evaluation: effective, efficient, engaging, error tolerant, easy to learn
- Many styles of evaluation (qualitative and quantitative):
  - Use/case studies
  - Usability testing
  - User studies
  - Longitudinal studies
  - Expert evaluation
  - Heuristic evaluation
11
Different Rules -> Different Views
[Figure courtesy of Aisee.com]
12
Categories of Mappings
- Based on data characteristics
  - Numbers, text, graphs, software, ...
- Logical groupings of techniques (Keim)
  - Standard: bars, lines, pie charts, scatterplots
  - Geometrically transformed: landscapes, parallel coordinates
  - Icon-based: stick figures, faces, profiles
  - Dense pixels: recursive segments, pixel bar charts
  - Stacked: treemaps, dimensional stacking
- Based on dimension management (Ward)
  - Dimension subsetting: scatterplots, pixel-oriented methods
  - Dimension reconfiguring: glyphs, parallel coordinates
  - Dimension reduction: PCA, MDS, Self Organizing Maps
  - Dimension embedding: dimensional stacking, worlds within worlds
13
Scatterplot Matrix
- Each pair of dimensions generates a single scatterplot
- All combinations arranged in a grid or matrix, each dimension controls a row or column
- Look for clusters, outliers, partial correlations, trends
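The grid layout described on this slide can be sketched in a few lines of Python. This is only an illustration, not XmdvTool's implementation: the function names, the dimension names, and the sample records are all hypothetical, and the choice of "column drives x, row drives y" is an assumption.

```python
from itertools import product

def scatterplot_matrix_layout(dim_names):
    """Map each grid cell (row, col) to the (x_dim, y_dim) pair it plots."""
    layout = {}
    for row, col in product(range(len(dim_names)), repeat=2):
        # Assumption: the column's dimension drives x, the row's drives y.
        layout[(row, col)] = (dim_names[col], dim_names[row])
    return layout

def panel_points(records, x_index, y_index):
    """Extract the 2-D points drawn in a single panel."""
    return [(rec[x_index], rec[y_index]) for rec in records]

# Hypothetical data: three dimensions, three records.
dims = ["mpg", "weight", "horsepower"]
records = [(30.0, 2100, 70), (18.0, 3500, 150), (24.0, 2800, 95)]

layout = scatterplot_matrix_layout(dims)
points = panel_points(records, x_index=1, y_index=0)  # weight vs. mpg panel
```

With N dimensions the grid has N x N panels, so screen space per panel shrinks quickly as dimensions are added.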
14
Parallel Coordinates
- Each variable/dimension is a vertical line
- Bottom of line is low value, top is high
- Each record creates a polyline across all dimensions
- Similar records cluster on the screen
- Look for clusters, outliers, line angles, crossings
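As a rough sketch of the mapping on this slide, the Python below turns each record into polyline vertices. It assumes min/max normalization per dimension, evenly spaced axes, and a coordinate system where y grows upward; the function name and the sample data are hypothetical.

```python
def parallel_coordinates_polylines(records, width=1.0, height=1.0):
    """Return one polyline (list of (x, y) vertices) per record."""
    n_dims = len(records[0])
    lows = [min(r[d] for r in records) for d in range(n_dims)]
    highs = [max(r[d] for r in records) for d in range(n_dims)]
    # One vertical axis per dimension, evenly spaced from left to right.
    xs = [width * d / (n_dims - 1) if n_dims > 1 else 0.0 for d in range(n_dims)]
    polylines = []
    for rec in records:
        line = []
        for d in range(n_dims):
            span = highs[d] - lows[d]
            # Low value maps to the bottom (0), high value to the top (height).
            # A constant dimension is drawn at mid-height (an arbitrary choice).
            t = (rec[d] - lows[d]) / span if span else 0.5
            line.append((xs[d], t * height))
        polylines.append(line)
    return polylines

# Hypothetical records with three dimensions; the third is constant.
records = [(0, 10, 5), (10, 20, 5), (5, 15, 5)]
polylines = parallel_coordinates_polylines(records)
```

Records with similar values produce nearly coincident polylines, which is why clusters show up as bands.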
15
Star Glyph
- Glyphs are shapes whose attributes are controlled by data values
- Star glyph is a set of N rays spaced at equal angles
- Length of each ray proportional to value for that dimension
- Line connects all endpoints of shape
- Lay glyphs out in rows and columns
- Look for shape similarities and differences, trends
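The star glyph construction above can be sketched as follows. The sketch assumes each value is first normalized to [0, 1] against a per-dimension range, so ray length is proportional to the normalized value; the function names, the spacing constant, and the sample values are hypothetical.

```python
import math

def star_glyph_vertices(values, lows, highs, center=(0.0, 0.0), max_radius=1.0):
    """Return the ray endpoints of one star glyph, in drawing order."""
    n = len(values)
    vertices = []
    for i, (v, lo, hi) in enumerate(zip(values, lows, highs)):
        span = hi - lo
        t = (v - lo) / span if span else 0.0  # normalized value in [0, 1]
        angle = 2 * math.pi * i / n           # N rays at equal angles
        vertices.append((center[0] + t * max_radius * math.cos(angle),
                         center[1] + t * max_radius * math.sin(angle)))
    return vertices  # connect consecutive vertices, then close the outline

def grid_centers(count, columns, spacing=2.5):
    """Lay glyph centers out in rows and columns."""
    return [((i % columns) * spacing, (i // columns) * spacing)
            for i in range(count)]

# Hypothetical record with four dimensions, each ranging over [0, 1].
verts = star_glyph_vertices([1.0, 0.5, 0.0, 1.0], [0] * 4, [1] * 4)
centers = grid_centers(3, columns=2)
```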
16
Other Types of Glyphs
17
Dimensional Stacking
- Break each dimension range into bins
- Break the screen into a grid using the number of bins for 2 dimensions
- Repeat the process for 2 more dimensions within the subimages formed by first grid, recurse through all dimensions
- Look for repeated patterns, outliers, trends, gaps
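The recursive subdivision described here reduces to simple index arithmetic. The sketch below is one possible reading, assuming equal-width bins and that dimensions 0, 2, 4, ... subdivide horizontally while 1, 3, 5, ... subdivide vertically, with the first pair forming the outermost grid. The function names and the bin counts are hypothetical.

```python
def bin_index(value, low, high, n_bins):
    """Equal-width binning; the top of the range falls in the last bin."""
    if high == low:
        return 0
    t = (value - low) / (high - low)
    return min(int(t * n_bins), n_bins - 1)

def stacked_cell(bins, bin_counts):
    """Return the (col, row) of the innermost cell for a record's bin indices.

    Even-numbered dimensions refine the column, odd-numbered ones the row,
    each nested inside the cell chosen by the previous pair.
    """
    col = row = 0
    for d, (b, n) in enumerate(zip(bins, bin_counts)):
        if d % 2 == 0:
            col = col * n + b
        else:
            row = row * n + b
    return col, row

# Hypothetical setup: four dimensions with 3, 2, 4, and 2 bins.
bin_counts = [3, 2, 4, 2]
cell = stacked_cell([2, 1, 3, 0], bin_counts)
```

With these counts the full image is 12 cells wide (3 x 4) and 4 cells tall (2 x 2).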
18
Pixel-Oriented Techniques
- Each dimension creates an image
- Each value controls color of a pixel
- Many organizations of pixels possible (raster, spiral, circle segment, space-filling curves)
- Reordering data can reveal interesting features, relations between dimensions
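A minimal sketch of the idea, assuming a raster (row-by-row) pixel organization and a grayscale ramp in place of a real color map. The function names and the four sample records are hypothetical; the other organizations named on the slide would only change how record order maps to pixel position.

```python
def value_to_gray(value, low, high):
    """Map a value to a gray level in 0..255."""
    if high == low:
        return 0
    return round(255 * (value - low) / (high - low))

def raster_image(values, width):
    """Fill pixels row by row, one pixel per value."""
    low, high = min(values), max(values)
    return [[value_to_gray(v, low, high) for v in values[start:start + width]]
            for start in range(0, len(values), width)]

def pixel_images(records, width, order_by=None):
    """Build one image per dimension, optionally sorting records first."""
    if order_by is not None:
        records = sorted(records, key=lambda r: r[order_by])
    n_dims = len(records[0])
    return [raster_image([r[d] for r in records], width) for d in range(n_dims)]

# Hypothetical records whose two dimensions are inversely related.
records = [(3, 10), (1, 30), (2, 20), (0, 40)]
images = pixel_images(records, width=2, order_by=0)
```

After sorting by the first dimension, its image ramps from dark to light while the second image ramps from light to dark, which makes the inverse relation visible.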
19
Methods to Cope with Scale
- Many modern datasets contain a large number of records (millions and billions) and/or dimensions (hundreds and thousands)
- Several strategies to handle scale problems
  - Sampling
  - Filtering
  - Clustering/aggregation
- Techniques can be automated or user-controlled
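The three strategies can each be illustrated with a few lines of Python. These are deliberately simple stand-ins: uniform random sampling, a range filter on one dimension, and fixed-width binning as a crude substitute for real clustering. All function names, parameters, and the sample data are hypothetical.

```python
import random
from collections import defaultdict

def sample(records, k, seed=0):
    """Uniform random sample of at most k records (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))

def filter_range(records, dim, low, high):
    """Keep only records whose value on one dimension lies in [low, high]."""
    return [r for r in records if low <= r[dim] <= high]

def aggregate_by_bin(records, dim, bin_width):
    """Group records into fixed-width bins and summarize each group.

    Returns {bin_key: (count, mean_record)}; binning here stands in for
    a real clustering algorithm.
    """
    groups = defaultdict(list)
    for r in records:
        groups[int(r[dim] // bin_width)].append(r)
    summary = {}
    for key, members in groups.items():
        n_dims = len(members[0])
        mean = tuple(sum(m[d] for m in members) / len(members)
                     for d in range(n_dims))
        summary[key] = (len(members), mean)
    return summary

# Hypothetical dataset: 100 two-dimensional records.
records = [(i, i * 2) for i in range(100)]
subset = sample(records, 10)
window = filter_range(records, dim=0, low=10, high=19)
clusters = aggregate_by_bin(records, dim=0, bin_width=25)
```

Drawing one mark per group instead of one per record is what keeps the display readable as the record count grows.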
20
Examples of Data Clustering
21
Example of Dimension Clustering
22
Example of Data Sampling
23
The Visual Data Analysis (VDA) Process
- Overview
- Filter/cluster/sample
- Scan
- Select “interesting”
- Details on demand
- Link between different views
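One common way to realize the "select", "details on demand", and "link between different views" steps is a shared selection model that every view observes. The sketch below assumes that design; it is not taken from XmdvTool, and the class name, the callbacks, and the records are hypothetical.

```python
class LinkedSelection:
    """Shared selection state; each view subscribes to be told of changes."""

    def __init__(self):
        self.selected = set()
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def select(self, record_ids):
        self.selected = set(record_ids)
        # Every linked view receives the same set of selected record ids.
        for callback in self._listeners:
            callback(self.selected)

# Hypothetical records keyed by id, and two stand-in "views".
records = {0: (1.0, 5.0), 1: (9.0, 2.0), 2: (8.5, 7.0)}
scatter_highlights, parallel_highlights = [], []

selection = LinkedSelection()
selection.subscribe(lambda ids: scatter_highlights.append(sorted(ids)))
selection.subscribe(lambda ids: parallel_highlights.append(sorted(ids)))

# Select "interesting" records in one view: first dimension above 8.0.
selection.select(rid for rid, rec in records.items() if rec[0] > 8.0)

# Details on demand: look up full records only for the selection.
details = {rid: records[rid] for rid in sorted(selection.selected)}
```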
24
Demonstration
25
Summary
- Visualization is a powerful component of the data analysis process
- Each stage of analysis can be enhanced
- Visualization can help guide computational analysis, and vice versa
- Multiple linked views and a rich assortment of interactions are key to success
26
For Further Info on XmdvTool
- http://davis.wpi.edu/~xmdv
- Contains source code, Windows executable, data sets, documentation, copies of most Xmdv publications, case studies
We gratefully acknowledge support for the development of XmdvTool from the National Science Foundation (IIS-9732897, IRIS-9729878, IIS-0119276, IIS-0414380, CCF-0811510, and IIS-0812027) and the National Security Agency.
27
Questions?