An Introduction to Multivariate Data Visualization and XmdvTool

An Introduction to Multivariate Data Visualization and XmdvTool
Matthew O. Ward
Computer Science Department, Worcester Polytechnic Institute
This work was supported under NSF Grant IIS-9732897
UC Berkeley, 09/19/00

What is Multivariate Data?
- Each data point has N variables or observations
- Each observation can be:
  - nominal or ordinal
  - discrete or continuous
  - scalar, vector, or tensor
- May or may not have spatial, temporal, or other connectivity attribute

Characteristics of a Variable
- Order: grades have an order, brand names do not.
- Distance metric: for income, distance equals difference. For rankings, difference is not a distance metric.
- Absolute zero: temperature has an absolute zero, bank account balances do not.
- A variable can be classified by these three attributes, called Scale.
- Effective visualizations attempt to match the scale of the data dimension with the graphical attribute conveying it.

Sources of Multivariate Data
- Sensors (e.g., images, gauges)
- Simulations
- Census or other surveys
- Commerce (e.g., stock market)
- Communication systems
- Spreadsheets and databases

Issues in Visualizing Multivariate Data
- How many variables?
- How many records?
- Types of variables?
- User task (exploration, confirmation, presentation)
- Data feature of interest (clusters, anomalies, trends, patterns, ...)
- Background of user (domain expert, visualization specialist, decision-maker, ...)

Methods for Visualizing Multivariate Data
- Dimensional Subsetting
- Dimensional Reorganization
- Dimensional Embedding
- Dimensional Reduction

Dimensional Subsetting
- Scatterplot matrix displays all pairwise plots
- Selection allows linkage between views
- Clusters, trends, and correlations readily discerned between pairs of dimensions

Dimensional Reorganization
- Parallel Coordinates creates parallel, rather than orthogonal, dimensions.
- Each data point corresponds to a polyline across the axes
- Clusters, trends, and anomalies discernible as groupings or outliers, based on intercepts and slopes
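
The mapping from a data point to a polyline can be sketched as follows. This is a minimal illustration, not XmdvTool's implementation: the function name `to_polylines`, the min-max normalization of each axis, and the even spacing of axes in [0, 1] are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def to_polylines(records: Sequence[Sequence[float]]) -> List[List[Point]]:
    """Map each N-D record to a polyline with one vertex per parallel axis."""
    n_dims = len(records[0])
    lows = [min(r[d] for r in records) for d in range(n_dims)]
    highs = [max(r[d] for r in records) for d in range(n_dims)]
    polylines = []
    for r in records:
        line = []
        for d in range(n_dims):
            span = highs[d] - lows[d]
            # A constant dimension is drawn at mid-height to avoid dividing by zero.
            y = 0.5 if span == 0 else (r[d] - lows[d]) / span
            # Axes are spaced evenly from x = 0 to x = 1 (assumed layout).
            x = d / (n_dims - 1) if n_dims > 1 else 0.0
            line.append((x, y))
        polylines.append(line)
    return polylines


records = [(1.0, 10.0, 5.0), (2.0, 30.0, 5.0), (3.0, 20.0, 5.0)]
lines = to_polylines(records)
```

Each returned polyline has one (x, y) vertex per dimension, so groupings and outliers show up as bundles of similar intercepts and slopes.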

Dimensional Reorganization (2)
- Glyphs map data dimensions to graphical attributes
- Size, color, shape, and orientation are commonly used
- Similarities/differences in features give insights into relations

Dimensional Embedding
- Dimensional stacking divides data space into bins
- Each N-D bin has a unique 2-D screen bin
- Screen space recursively divided based on bin count for each dimension
- Clusters and trends manifested as repeated patterns
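
One way to picture the N-D to 2-D bin mapping is the sketch below. It assumes a particular ordering that the slide does not specify: even-numbered dimensions are stacked horizontally, odd-numbered ones vertically, and earlier dimensions form the outer (coarser) grid. The name `stack_bin` is hypothetical.

```python
import itertools
from typing import Sequence, Tuple


def stack_bin(bin_index: Sequence[int], bin_counts: Sequence[int]) -> Tuple[int, int]:
    """Return the 2-D screen cell for an N-D bin.

    Assumed layout: dimensions 0, 2, 4, ... subdivide the horizontal axis and
    dimensions 1, 3, 5, ... the vertical axis, outermost dimension first.
    """
    x = y = 0
    for d, (b, count) in enumerate(zip(bin_index, bin_counts)):
        if not 0 <= b < count:
            raise ValueError(f"bin {b} out of range for dimension {d}")
        if d % 2 == 0:
            x = x * count + b  # refine the horizontal position
        else:
            y = y * count + b  # refine the vertical position
    return x, y


counts = (2, 3, 4, 5)
cell = stack_bin((1, 2, 3, 4), counts)
# Every N-D bin should land in its own screen cell.
all_cells = {stack_bin(idx, counts)
             for idx in itertools.product(*(range(c) for c in counts))}
```

Because each step multiplies by the bin count of the dimension being added, the screen is subdivided recursively and no two N-D bins share a cell.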

Dimensional Reduction
- Map N-D locations to M-D display space while best preserving N-D relations
- Approaches include MDS, PCA, and Kohonen Self-Organizing Maps
- Relationships conveyed by position, links, color, shape, size, etc.
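
As a small illustration of one of the listed approaches, the sketch below projects records onto their first principal component using power iteration on the covariance matrix. It is pure Python for self-containment and reduces to a single dimension only; the function name, the iteration count, and the starting vector are assumptions, and a real system would more likely use a linear algebra library.

```python
import math
from typing import List, Sequence, Tuple


def first_principal_component(
    records: Sequence[Sequence[float]], iterations: int = 100
) -> Tuple[List[float], List[float]]:
    """Return (unit direction, 1-D coordinates) for the first principal component."""
    n = len(records)
    d = len(records[0])
    means = [sum(r[j] for r in records) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in records]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # assumed starting vector for power iteration
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance along v; keep the current vector
        v = [x / norm for x in w]
    coords = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, coords


points = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
direction, coords = first_principal_component(points)
```

For points that lie on the line y = 2x, the recovered direction is proportional to (1, 2), and the 1-D coordinates preserve the spacing of the points along that line.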

The Role of Selection
- User needs to interact with display, examine interesting patterns or anomalies, validate hypotheses
- Selection allows isolation of subset of data for highlighting, deleting, focused analysis
- Direct (clicking on displayed items) vs. indirect (range sliders)
- Screen space (2-D) vs. data space (N-D)
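
A data-space (N-D) selection can be sketched as a hyperbox test, which is roughly what a set of range sliders expresses. The function name `brush_records` and the dictionary-of-ranges representation are assumptions for illustration only.

```python
from typing import Dict, List, Sequence, Tuple


def brush_records(
    records: Sequence[Sequence[float]],
    ranges: Dict[int, Tuple[float, float]],
) -> List[int]:
    """Return indices of records inside every (low, high) range.

    `ranges` maps a dimension index to an inclusive interval; dimensions
    that are not listed are left unconstrained.
    """
    return [
        i
        for i, r in enumerate(records)
        if all(lo <= r[d] <= hi for d, (lo, hi) in ranges.items())
    ]


data = [(1.0, 5.0, 9.0), (2.0, 6.0, 1.0), (3.0, 7.0, 4.0)]
selected = brush_records(data, {0: (1.5, 3.0), 2: (0.0, 5.0)})
```

The returned indices can then drive highlighting, deletion, or focused analysis in any linked view.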

Demonstration of XmdvTool

Problems with Large Data Sets
- Most techniques are effective with small to moderate-sized data sets
- Large sets (> 50K records) are increasingly common
- When traditional visualizations are used, occlusion and clutter make interpretation difficult

One Potential Solution
- Multiresolution displays with aggregation
- Explicit clustering
  - Break dimensions into bins
  - Aggregate in a particular order (datacubes)
- Implicit clustering
  - Hierarchical clustering (proximity-based merging)
  - Hierarchical partitioning (proximity-based splits)
- Problem: many ways to cluster, each revealing different aspects of data
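
The explicit-clustering option (breaking dimensions into bins) can be sketched as below. The function name `bin_records`, the equal-width bins, and the requirement that each dimension's range satisfies low < high are assumptions for this example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def bin_records(
    records: Sequence[Sequence[float]],
    lows: Sequence[float],
    highs: Sequence[float],
    bins_per_dim: Sequence[int],
) -> Dict[Tuple[int, ...], List[Sequence[float]]]:
    """Group records by the N-D bin they fall into (equal-width bins)."""
    groups: Dict[Tuple[int, ...], List[Sequence[float]]] = defaultdict(list)
    for r in records:
        key = []
        for value, lo, hi, n_bins in zip(r, lows, highs, bins_per_dim):
            idx = int((value - lo) / (hi - lo) * n_bins)
            # Clamp so that a value equal to the upper bound lands in the last bin.
            key.append(min(idx, n_bins - 1))
        groups[tuple(key)].append(r)
    return dict(groups)


data = [(0.0, 0.0), (0.4, 0.9), (1.0, 1.0), (0.45, 0.95)]
groups = bin_records(data, lows=(0.0, 0.0), highs=(1.0, 1.0), bins_per_dim=(2, 2))
populations = {key: len(members) for key, members in groups.items()}
```

Each group then acts as one aggregate that a multiresolution display can draw in place of its individual records.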

Display Options
- For each cluster, show
  - Center
  - Extents for each dimension
  - Population
  - Other descriptors (e.g., quartiles)
- Color clusters such that siblings have similar color to parents
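
The per-cluster descriptors listed above can be computed with a few lines. This is a sketch under the assumption that the center is the per-dimension mean and the extents are the per-dimension minimum and maximum; the name `summarize_cluster` is hypothetical.

```python
from typing import Dict, Sequence


def summarize_cluster(cluster: Sequence[Sequence[float]]) -> Dict[str, object]:
    """Compute center, per-dimension extents, and population of one cluster."""
    population = len(cluster)
    n_dims = len(cluster[0])
    # Center taken as the mean in each dimension (an assumption).
    center = tuple(sum(r[d] for r in cluster) / population for d in range(n_dims))
    # Extents are the (min, max) pair in each dimension.
    extents = tuple(
        (min(r[d] for r in cluster), max(r[d] for r in cluster))
        for d in range(n_dims)
    )
    return {"center": center, "extents": extents, "population": population}


summary = summarize_cluster([(1.0, 4.0), (3.0, 8.0)])
```

These three values are enough to draw a band, rectangle, or glyph for the cluster without touching its individual records.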

Hierarchical Parallel Coordinates
- Bands show cluster extents in each dimension
- Opacity conveys cluster population
- Color similarity indicates proximity in hierarchy

Hierarchical Scatterplots
- Clusters displayed as rectangles, showing extents in 2 dimensions
- Color/opacity consistently used for relational and population info

Hierarchical Glyphs
- Star glyph with bands
- Analogous to parallel coordinates, with radial rather than parallel dimensions
- Glyph position critical for conveying relational info

Hierarchical Dimensional Stacking
- Clusters occupy multiple bins
- Overlaps can be reduced by increasing number of bins
- Cell colors can be blended, or the cell can show the last cluster mapped to that space

Hierarchical Star Fields
- Dimensional reduction techniques commonly displayed with starfields
- Each cluster becomes circle/sphere in field
- Alternatively, can show glyph at cluster location

Navigating Hierarchies
- Drill-down, roll-up operations for more or less detail
- Need selection operation to identify subtrees for exploration, pruning
- Need indications of where you are in hierarchy, and where you’ve been during exploration process

Structure-Based Brushing
- Enhancement to screen-based and data-based methods
- Specify focus, extents, and level of detail
- Intuitive: wedge of tree and depth of interest
- Implemented by labeling/numbering terminals and propagating ranges to parents
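
The last bullet (numbering terminals and propagating ranges to parents) can be sketched as follows. The `Node` class, the left-to-right numbering, and the rule that a brush selects nodes at the requested depth whose leaf range overlaps the wedge are all assumptions for illustration; the slides do not give the exact selection rule.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)
    leaf_range: Tuple[int, int] = (0, 0)  # filled in by number_leaves


def number_leaves(node: Node, next_label: int = 0) -> int:
    """Number terminals left to right and propagate [first, last] to parents.

    Returns the next unused label.
    """
    if not node.children:
        node.leaf_range = (next_label, next_label)
        return next_label + 1
    first = next_label
    for child in node.children:
        next_label = number_leaves(child, next_label)
    node.leaf_range = (first, next_label - 1)
    return next_label


def structure_brush(node: Node, lo: int, hi: int, level: int, depth: int = 0) -> List[str]:
    """Select nodes at `level` (or leaves above it) overlapping the wedge [lo, hi]."""
    first, last = node.leaf_range
    if last < lo or first > hi:
        return []  # the whole subtree lies outside the wedge
    if depth == level or not node.children:
        return [node.name]
    selected: List[str] = []
    for child in node.children:
        selected.extend(structure_brush(child, lo, hi, level, depth + 1))
    return selected


root = Node("root", [
    Node("a", [Node("a1"), Node("a2")]),
    Node("b", [Node("b1"), Node("b2"), Node("b3")]),
])
leaf_count = number_leaves(root)
coarse = structure_brush(root, lo=1, hi=2, level=1)
fine = structure_brush(root, lo=3, hi=4, level=2)
```

Because every node carries the range of its terminals, testing a subtree against the wedge is a constant-time interval comparison, and subtrees outside the wedge are skipped entirely.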

Structure-Based Brush
- White contour links terminal nodes
- Red wedge is extents selection
- Color curve is depth specification
- Color bar maps location in tree to unique color
- Direct and indirect manipulation of brush

Demonstration of Hierarchical Features in XmdvTool

Auxiliary Tools
- Extent scaling to reduce occlusion of bands
- Dimensional zooming: fill display with selected subspace (N-D distortion)
- Dynamic masking to fade out selected or unselected data
- Saving selected subsets
- Enabling/disabling dimensions
- Univariate displays (Tukey box plots, tree maps)

Summary
- Many ways to map multivariate data to images, each with strengths and weaknesses
- Linking between and within displays with brushing enhances static displays and combines their strengths.
- Hierarchies and aggregations allow visualization of large data sets
- Intuitive navigation, filtering, and focus critical to exploration process
- Each basic multivariate visualization method is readily extensible to displaying cluster information

Problems and Future Work
- Many data characteristics not currently supported (e.g., text fields, records with missing entries, data quality)
- Navigation tool for hierarchies assumes linear order of nodes
  - Looking at tools for dynamic reorganization
  - Exploring 2-D or higher navigation interfaces
- Very large hierarchies are difficult to focus on a narrow subset
  - Developing multiresolution interface
  - Investigating distortion techniques for navigation

Other Future Work
- Projection pursuit or view recommender
- Linking structure and data brushing
- User studies
- Customization based on domain
- Query optimization via caching and prefetching

For More Information
- XmdvTool is available in the public domain
- Both Unix and Windows support
- Next release will have Oracle interface, with query optimization to support exploratory operations
- http://davis.wpi.edu/~xmdv
- Papers in Vis ’94, ’95, ’99, Infovis ’99, IEEE TVCG Vol. 6, No. 2, 2000

Thanks to...
- Elke Rundensteiner
- Ying-Huey Fua
- Daniel Stroe
- Yang Jing
- Suggestions from Xmdv users
- NSF