UC Berkeley, 09/19/00 An Introduction to Multivariate Data Visualization and XmdvTool Matthew O. Ward Computer Science Department Worcester Polytechnic.

Presentation transcript:

UC Berkeley, 09/19/00 An Introduction to Multivariate Data Visualization and XmdvTool Matthew O. Ward Computer Science Department Worcester Polytechnic Institute This work was supported under NSF Grant IIS

What is Multivariate Data?
- Each data point has N variables or observations
- Each observation can be:
  - nominal or ordinal
  - discrete or continuous
  - scalar, vector, or tensor
- May or may not have spatial, temporal, or other connectivity attribute

Characteristics of a Variable
- Order: grades have an order, brand names do not.
- Distance metric: for income, distance equals difference. For rankings, difference is not a distance metric.
- Absolute zero: temperature has an absolute zero, bank account balances do not.
- A variable can be classified by these three attributes, called Scale.
- Effective visualizations attempt to match the scale of the data dimension with the graphical attribute conveying it.

Sources of Multivariate Data
- Sensors (e.g., images, gauges)
- Simulations
- Census or other surveys
- Commerce (e.g., stock market)
- Communication systems
- Spreadsheets and databases

Issues in Visualizing Multivariate Data
- How many variables?
- How many records?
- Types of variables?
- User task (exploration, confirmation, presentation)
- Data feature of interest (clusters, anomalies, trends, patterns, ...)
- Background of user (domain expert, visualization specialist, decision-maker, ...)

Methods for Visualizing Multivariate Data
- Dimensional Subsetting
- Dimensional Reorganization
- Dimensional Embedding
- Dimensional Reduction

Dimensional Subsetting
- Scatterplot matrix displays all pairwise plots
- Selection allows linkage between views
- Clusters, trends, and correlations readily discerned between pairs of dimensions
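As a rough illustration of the scatterplot-matrix idea, the sketch below builds one panel per pair of dimensions and carries a selection across all panels by record index. The function names (`scatterplot_matrix_panels`, `linked_points`) and the sample data are hypothetical and are not taken from XmdvTool; only the unordered pairs are generated, which is an assumption about layout.

```python
from itertools import combinations

def scatterplot_matrix_panels(records, names):
    """Return one panel per unordered pair of dimensions.

    Each panel is a list of (x, y) points in record order, so the same
    list index refers to the same record in every panel.
    """
    panels = {}
    for i, j in combinations(range(len(names)), 2):
        panels[(names[i], names[j])] = [(r[i], r[j]) for r in records]
    return panels

def linked_points(panels, selected_rows):
    """Pick out the selected records in every panel (linked selection)."""
    rows = sorted(selected_rows)
    return {pair: [pts[k] for k in rows] for pair, pts in panels.items()}

# Hypothetical data: three records, three dimensions.
records = [(1.0, 10.0, 5.0), (2.0, 20.0, 3.0), (3.0, 30.0, 1.0)]
panels = scatterplot_matrix_panels(records, ["a", "b", "c"])
highlight = linked_points(panels, {0, 2})
```

Keeping every panel in record order is what makes the linkage cheap: a selection is just a set of row indices shared by all views.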

Dimensional Reorganization
- Parallel Coordinates creates parallel, rather than orthogonal, dimensions.
- Data point corresponds to polyline across axes
- Clusters, trends, and anomalies discernible as groupings or outliers, based on intercepts and slopes
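The mapping from a data point to a polyline can be sketched as follows. This is a minimal illustration, assuming evenly spaced vertical axes and linear scaling of each dimension between a supplied minimum and maximum; the function name `polyline` and its parameters are hypothetical.

```python
def polyline(record, mins, maxs, width=1.0, height=1.0):
    """Return the (x, y) vertices of one record's polyline.

    Axis k is a vertical line at x = k * spacing; the record's value on
    dimension k is scaled linearly to [0, height] along that axis.
    """
    n = len(record)
    spacing = width / (n - 1) if n > 1 else 0.0
    vertices = []
    for k, value in enumerate(record):
        span = maxs[k] - mins[k]
        # A dimension with no spread is drawn at mid-height (an assumption).
        t = (value - mins[k]) / span if span else 0.5
        vertices.append((k * spacing, t * height))
    return vertices
```

For example, `polyline((5, 0, 10), (0, 0, 0), (10, 10, 10))` places the three vertices at mid-height on the first axis, at the bottom of the second, and at the top of the third.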

Dimensional Reorganization (2)
- Glyphs map data dimensions to graphical attributes
- Size, color, shape, and orientation are commonly used
- Similarities/differences in features give insights into relations
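One common glyph is the star glyph, where each dimension controls the length of one ray. The sketch below computes the outline of such a glyph; the ray ordering, the normalization, and the name `star_glyph` are assumptions made for illustration, not a description of any particular tool.

```python
import math

def star_glyph(values, mins, maxs, cx=0.0, cy=0.0, radius=1.0):
    """Return the outline vertices of a star glyph for one record.

    Dimension k is drawn as a ray at angle 2*pi*k/n from the glyph center;
    the ray length is the normalized value times `radius`.
    """
    n = len(values)
    outline = []
    for k, v in enumerate(values):
        span = maxs[k] - mins[k]
        t = (v - mins[k]) / span if span else 0.5  # flat dimension: half length
        angle = 2.0 * math.pi * k / n
        outline.append((cx + t * radius * math.cos(angle),
                        cy + t * radius * math.sin(angle)))
    return outline
```

Connecting the returned vertices in order gives the glyph's shape, so records with similar values produce visibly similar outlines.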

Dimensional Embedding
- Dimensional stacking divides data space into bins
- Each N-D bin has a unique 2-D screen bin
- Screen space recursively divided based on bin count for each dimension
- Clusters and trends manifested as repeated patterns
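The recursive subdivision can be sketched as a mixed-radix index computation. The code below assumes that dimensions are listed outermost first and that they alternate between the horizontal and vertical screen axes; that ordering, and the name `stacked_cell`, are illustrative choices rather than anything stated in the slides.

```python
from itertools import product

def stacked_cell(bin_index, bin_counts):
    """Map an N-D bin index to a unique 2-D screen cell (column, row).

    Dimensions are consumed outermost first; even positions subdivide the
    horizontal axis and odd positions the vertical axis (an assumed order).
    """
    col = row = 0
    for d, (b, count) in enumerate(zip(bin_index, bin_counts)):
        if not 0 <= b < count:
            raise ValueError(f"bin {b} out of range for dimension {d}")
        if d % 2 == 0:
            col = col * count + b
        else:
            row = row * count + b
    return col, row

# Hypothetical 4-D space with 3, 2, 4, and 5 bins per dimension.
counts = (3, 2, 4, 5)
cells = {stacked_cell(idx, counts)
         for idx in product(*(range(c) for c in counts))}
```

With these counts the screen grid is 12 columns by 10 rows, and every one of the 120 N-D bins lands in its own cell.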

Dimensional Reduction
- Map N-D locations to M-D display space while best preserving N-D relations
- Approaches include MDS, PCA, and Kohonen Self Organizing Maps
- Relationships conveyed by position, links, color, shape, size, etc.
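Of the approaches listed, PCA is the simplest to sketch. The code below is a minimal pure-Python illustration that finds the leading components by power iteration with deflation; it assumes the data has at least `m` directions of nonzero variance and that the fixed start vector is not orthogonal to the dominant direction. All function names are hypothetical, and a numerical library would normally be used instead.

```python
import math

def covariance(rows):
    """Center the rows and return (centered rows, sample covariance matrix)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - means[k] for k in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return centered, cov

def top_component(cov, iterations=200):
    """Power iteration: dominant eigenvector and eigenvalue of a symmetric matrix."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue

def pca_project(rows, m=2):
    """Project N-D rows onto their first m principal components."""
    centered, cov = covariance(rows)
    d = len(cov)
    components = []
    for _ in range(m):
        v, lam = top_component(cov)
        components.append(v)
        # Deflate: remove the found component so the next pass finds the next one.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [[sum(c[k] * v[k] for k in range(d)) for v in components]
            for c in centered]
```

The resulting M-D coordinates (here M = 2) can then be drawn as positions in a scatterplot; the sign of each component is arbitrary.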

The Role of Selection
- User needs to interact with display, examine interesting patterns or anomalies, validate hypotheses
- Selection allows isolation of subset of data for highlighting, deleting, focussed analysis
- Direct (clicking on displayed items) vs. indirect (range sliders)
- Screen space (2-D) vs. data space (N-D)
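A data-space (N-D) selection driven by range sliders can be sketched as a hyperbox test, one range per dimension. The function name `brush_select` and the inclusive bounds are assumptions made for this illustration.

```python
def brush_select(records, lows, highs):
    """Return the indices of records inside the N-D box [lows, highs].

    One (low, high) pair per dimension, as a set of range sliders would give;
    bounds are inclusive.
    """
    return [i for i, r in enumerate(records)
            if all(lo <= v <= hi for v, lo, hi in zip(r, lows, highs))]
```

For example, `brush_select([(1, 5), (3, 2), (4, 9)], (0, 0), (3, 6))` keeps the first two records and rejects the third, which falls outside the box on both dimensions.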

Demonstration of XmdvTool

Problems with Large Data Sets
- Most techniques are effective with small to moderate sized data sets
- Large sets (> 50K records) are increasingly common
- When traditional visualizations are used, occlusion and clutter make interpretation difficult

One Potential Solution
- Multiresolution displays with aggregation
- Explicit clustering
  - Break dimensions into bins
  - Aggregate in a particular order (datacubes)
- Implicit clustering
  - Hierarchical clustering (proximity-based merging)
  - Hierarchical partitioning (proximity-based splits)
- Problem: many ways to cluster, each revealing different aspects of data
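Proximity-based merging can be sketched as a simple agglomerative loop. The version below repeatedly merges the two clusters with the closest centroids, which is only one of many possible linkage rules; the choice of centroid distance, the nested-tuple tree format, and the name `agglomerate` are assumptions for illustration. It is cubic in the number of points and assumes a non-empty input.

```python
import math

def agglomerate(points):
    """Merge the two closest clusters (by centroid distance) until one remains.

    Returns a nested tree: a leaf is a record index, an internal node is a
    (left, right) tuple.
    """
    # Each active cluster is (tree, centroid, size).
    clusters = [(i, list(p), 1) for i, p in enumerate(points)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dist = math.dist(clusters[a][1], clusters[b][1])
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        ta, ca, na = clusters[a]
        tb, cb, nb = clusters[b]
        merged = ((ta, tb),
                  [(x * na + y * nb) / (na + nb) for x, y in zip(ca, cb)],
                  na + nb)
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0][0]
```

A different distance or linkage rule would generally produce a different tree over the same data, which is the problem the last bullet points out.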

Display Options
- For each cluster, show
  - Center
  - Extents for each dimension
  - Population
  - Other descriptors (e.g., quartiles)
- Color clusters such that siblings have similar color to parents
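The per-cluster descriptors listed above can be computed in a few lines. This sketch takes the center to be the per-dimension mean and the extents to be the per-dimension minimum and maximum, which are assumptions; the name `summarize` and the dictionary layout are hypothetical.

```python
def summarize(records):
    """Summarize one non-empty cluster: center, per-dimension extents, population."""
    n = len(records)
    dims = list(zip(*records))  # one tuple of values per dimension
    return {
        "center": [sum(col) / n for col in dims],
        "extents": [(min(col), max(col)) for col in dims],
        "population": n,
    }
```

For instance, `summarize([(1, 10), (3, 30)])` gives a center of `[2.0, 20.0]`, extents of `[(1, 3), (10, 30)]`, and a population of 2.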

Hierarchical Parallel Coordinates
- Bands show cluster extents in each dimension
- Opacity conveys cluster population
- Color similarity indicates proximity in hierarchy

Hierarchical Scatterplots
- Clusters displayed as rectangles, showing extents in 2 dimensions
- Color/opacity consistently used for relational and population info

Hierarchical Glyphs
- Star glyph with bands
- Analogous to parallel coordinates, with radial rather than parallel dimensions
- Glyph position critical for conveying relational info

Hierarchical Dimensional Stacking
- Clusters occupy multiple bins
- Overlaps can be reduced by increasing number of bins
- Cell colors can be blended, or display last cluster mapped to space

Hierarchical Star Fields
- Dimensional reduction techniques commonly displayed with starfields
- Each cluster becomes circle/sphere in field
- Alternatively, can show glyph at cluster location

Navigating Hierarchies
- Drill-down, roll-up operations for more or less detail
- Need selection operation to identify subtrees for exploration, pruning
- Need indications of where you are in hierarchy, and where you've been during exploration process

Structure-Based Brushing
- Enhancement to screen-based and data-based methods
- Specify focus, extents, and level of detail
- Intuitive - wedge of tree and depth of interest
- Implemented by labeling/numbering terminals and propagating ranges to parents
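The labeling scheme in the last bullet can be sketched as follows: terminals are numbered left to right, and each internal node inherits the range spanned by its children. The tree format (nested non-empty tuples, with any non-tuple value as a terminal), the exact-depth test used for level of detail, and the function names are all simplifying assumptions, not the tool's actual implementation.

```python
def label_ranges(tree):
    """Number terminals left to right and give each node the (first, last)
    terminal numbers beneath it.

    Returns a list of (depth, first, last, node) in post-order.
    """
    out = []
    counter = 0

    def visit(node, depth):
        nonlocal counter
        if not isinstance(node, tuple):        # terminal node
            first = last = counter
            counter += 1
        else:                                  # propagate child ranges upward
            ranges = [visit(child, depth + 1) for child in node]
            first, last = ranges[0][0], ranges[-1][1]
        out.append((depth, first, last, node))
        return first, last

    visit(tree, 0)
    return out

def structure_brush(labeled, lo, hi, depth):
    """Select nodes at `depth` whose terminal range lies inside the wedge [lo, hi]."""
    return [node for d, first, last, node in labeled
            if d == depth and lo <= first and last <= hi]

# Hypothetical hierarchy over five terminals a..e (numbered 0..4).
tree = (("a", "b"), ("c", ("d", "e")))
labeled = label_ranges(tree)
picked = structure_brush(labeled, 2, 4, 2)
```

Because every subtree covers a contiguous range of terminal numbers, testing whether a node falls inside the wedge reduces to two integer comparisons.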

Structure-Based Brush
- White contour links terminal nodes
- Red wedge is extents selection
- Color curve is depth specification
- Color bar maps location in tree to unique color
- Direct and indirect manipulation of brush

Demonstration of Hierarchical Features in XmdvTool

Auxiliary Tools
- Extent scaling to reduce occlusion of bands
- Dimensional zooming - fill display with selected subspace (N-D distortion)
- Dynamic masking to fade out selected or unselected data
- Saving selected subsets
- Enabling/disabling dimensions
- Univariate displays (Tukey box plots, tree maps)

Summary
- Many ways to map multivariate data to images, each with strengths and weaknesses
- Linking between and within displays with brushing enhances static displays and combines their strengths.
- Hierarchies and aggregations allow visualization of large data sets
- Intuitive navigation, filtering, and focus critical to exploration process
- Each basic multivariate visualization method is readily extensible to displaying cluster information

Problems and Future Work
- Many data characteristics not currently supported (e.g., text fields, records with missing entries, data quality)
- Navigation tool for hierarchies assumes linear order of nodes
  - Looking at tools for dynamic reorganization
  - Exploring 2-D or higher navigation interfaces
- In very large hierarchies, it is difficult to focus on a narrow subset
  - Developing multiresolution interface
  - Investigating distortion techniques for navigation

Other Future Work
- Projection pursuit or view recommender
- Linking structure and data brushing
- User studies
- Customization based on domain
- Query optimization via caching and prefetching

For More Information
- XmdvTool is available in the public domain
- Both Unix and Windows support
- Next release will have Oracle interface, with query optimization to support exploratory operations
- http://davis.wpi.edu/~xmdv
- Papers in Vis '94, '95, '99, InfoVis '99, IEEE TVCG Vol. 6, No. 2, 2000

Thanks to...
- Elke Rundensteiner
- Ying-Huey Fua
- Daniel Stroe
- Yang Jing
- Suggestions from Xmdv users
- NSF