NERCOMP Workshop, Dec. 2, 2008 Information Visualization: the Other Half of Data Analysis Dr. Matthew Ward Computer Science Department Worcester Polytechnic Institute


NERCOMP Workshop, Dec. 2, 2008 Information Visualization: the Other Half of Data Analysis Dr. Matthew Ward Computer Science Department Worcester Polytechnic Institute

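A Data Analysis Pipeline
[Diagram. Stages: Raw Data, Processed Data, Hypotheses, Models, Results. Operations: Cleaning, Filtering, Transforming; Statistical Analysis, Pattern Recognition, Knowledge Discovery; Validation. Labels A, B, C, and D mark the points in the pipeline discussed on the next slide.]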

Where Does Visualization Come In?
- All stages can benefit from visualization
- A: identify bad data, select subsets, help choose transforms (exploratory)
- B: help choose computational techniques, set parameters, use vision to recognize, isolate, classify patterns (exploratory)
- C: superimpose derived models on data (confirmatory)
- D: present results (presentation)

What Do We Need to Know to Do Information Visualization?
- Characteristics of data
  - Types, size, structure
  - Semantics, completeness, accuracy
- Characteristics of user
  - Perceptual and cognitive abilities
  - Knowledge of domain, data, tasks, tools
- Characteristics of graphical mappings
  - What are the possibilities
  - Which convey data effectively and efficiently
- Characteristics of interactions
  - Which support the tasks best
  - Which are easy to learn, use, remember

Issues Regarding Data
- Type may indicate which graphical mappings are appropriate
  - Nominal vs. ordinal
  - Discrete vs. continuous
  - Ordered vs. unordered
  - Univariate vs. multivariate
  - Scalar vs. vector vs. tensor
  - Static vs. dynamic
  - Values vs. relations
- Trade-offs between size and accuracy needs
- Different orders/structures can reveal different features/patterns

Issues Regarding Users
- What graphical attributes do we perceive accurately?
- What graphical attributes do we perceive quickly?
- Which combinations of attributes are separable?
- Coping with change blindness
- How can visuals support the development of accurate mental models of the data?
- Relative vs. absolute judgements: impact on tasks

Issues Regarding Mappings
- Variables include shape, size, orientation, color, texture, opacity, position, motion, ...
- Some of these have an order, others don't
- Some use up significant screen space
- Sensitivity to occlusion
- Domain customs/expectations

[Image: www3.sympatico.ca/blevis/Image10.gif]

Issues Regarding Interactions
- Interaction is a critical component
- Many categories of techniques
  - Navigation, selection, filtering, reconfiguring, encoding, connecting, and combinations of the above
- Many "spaces" in which interactions can be applied
  - Screen/pixels, data, data structures, graphical objects, graphical attributes, visualization structures

Importance of Evaluation
- Easy to design bad visualizations
- Many design rules exist; many conflict, and many are routinely violated
- 5 E's of evaluation: effective, efficient, engaging, error tolerant, easy to learn
- Many styles of evaluation (qualitative and quantitative):
  - Use/case studies
  - Usability testing
  - User studies
  - Longitudinal studies
  - Expert evaluation
  - Heuristic evaluation

Different Rules -> Different Views
[Image courtesy of Aisee.com]

Categories of Mappings
- Based on data characteristics
  - Numbers, text, graphs, software, ...
- Logical groupings of techniques (Keim)
  - Standard: bars, lines, pie charts, scatterplots
  - Geometrically transformed: landscapes, parallel coordinates
  - Icon-based: stick figures, faces, profiles
  - Dense pixels: recursive segments, pixel bar charts
  - Stacked: treemaps, dimensional stacking
- Based on dimension management (Ward)
  - Dimension subsetting: scatterplots, pixel-oriented methods
  - Dimension reconfiguring: glyphs, parallel coordinates
  - Dimension reduction: PCA, MDS, Self Organizing Maps
  - Dimension embedding: dimensional stacking, worlds within worlds

Scatterplot Matrix
- Each pair of dimensions generates a single scatterplot
- All combinations arranged in a grid or matrix, each dimension controls a row or column
- Look for clusters, outliers, partial correlations, trends
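The grid arrangement described on this slide can be sketched in a few lines. This is a minimal illustration, not XmdvTool's implementation: the function name `scatterplot_matrix_cells`, the tuple-per-record layout, and the sample data are all assumptions made for the example. It only computes which points belong in each grid cell; drawing is left to whatever plotting layer is available.

```python
from itertools import product

def scatterplot_matrix_cells(records, dim_names):
    """Return {(row_dim, col_dim): [(x, y), ...]} for every cell of the grid.

    Hypothetical helper: the column dimension supplies x, the row dimension y.
    """
    cells = {}
    for (r, row_dim), (c, col_dim) in product(enumerate(dim_names), repeat=2):
        cells[(row_dim, col_dim)] = [(rec[c], rec[r]) for rec in records]
    return cells

# Made-up sample data: three records, three dimensions.
records = [(1.0, 10.0, 5.0), (2.0, 20.0, 3.0), (3.0, 30.0, 4.0)]
cells = scatterplot_matrix_cells(records, ["a", "b", "c"])
```

With N dimensions the grid holds N x N cells, so the display grows quadratically with the number of dimensions.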

Parallel Coordinates
- Each variable/dimension is a vertical line
- Bottom of line is low value, top is high
- Each record creates a polyline across all dimensions
- Similar records cluster on the screen
- Look for clusters, outliers, line angles, crossings
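As a rough sketch of the mapping on this slide, the code below turns each record into polyline vertices. The function name, the min/max scaling per axis, and the sample data are assumptions for illustration. It uses mathematical coordinates (y = 0 is the bottom of an axis), so a screen renderer with y growing downward would need to flip y.

```python
def parallel_coordinates(records, width=1.0, height=1.0):
    """Map each record to a polyline with one vertex per axis.

    Hypothetical helper; requires at least two dimensions. Each axis is
    scaled independently so its lowest value sits at y = 0 and its
    highest at y = height.
    """
    n_dims = len(records[0])
    lows = [min(rec[d] for rec in records) for d in range(n_dims)]
    highs = [max(rec[d] for rec in records) for d in range(n_dims)]
    # Axes are spaced evenly from x = 0 to x = width.
    xs = [width * d / (n_dims - 1) for d in range(n_dims)]
    polylines = []
    for rec in records:
        line = []
        for d in range(n_dims):
            span = highs[d] - lows[d]
            # A constant dimension has no range; place it mid-axis.
            t = (rec[d] - lows[d]) / span if span else 0.5
            line.append((xs[d], t * height))
        polylines.append(line)
    return polylines

# Made-up sample data: three records, two dimensions.
polylines = parallel_coordinates([(0, 10), (5, 20), (10, 30)])
```

In this sample the two dimensions are perfectly correlated, so all three polylines come out horizontal; a negative correlation would instead show up as crossing lines.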

Star Glyph
- Glyphs are shapes whose attributes are controlled by data values
- Star glyph is a set of N rays spaced at equal angles
- Length of each ray proportional to value for that dimension
- Line connects all endpoints of shape
- Lay glyphs out in rows and columns
- Look for shape similarities and differences, trends
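The star glyph geometry can be sketched as follows. The function name `star_glyph`, the choice to scale each ray by a per-dimension maximum, and the sample values are assumptions for the example; other normalizations are equally plausible.

```python
import math

def star_glyph(values, max_values, radius=1.0, center=(0.0, 0.0)):
    """Return the ray endpoints of a star glyph for one record.

    Hypothetical helper: ray k points at angle 2*pi*k/N and its length is
    radius * value / max_value for that dimension.
    """
    n = len(values)
    cx, cy = center
    points = []
    for k, (v, vmax) in enumerate(zip(values, max_values)):
        angle = 2 * math.pi * k / n
        length = radius * v / vmax
        points.append((cx + length * math.cos(angle),
                       cy + length * math.sin(angle)))
    return points

# Made-up record with four dimensions, each with maximum 4.
points = star_glyph([1, 2, 4, 2], [4, 4, 4, 4])
# Closing the outline: connect the endpoints and return to the first one.
outline = points + points[:1]
```

Laying glyphs out in rows and columns then only requires passing a different `center` for each record.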

Other Types of Glyphs

Dimensional Stacking
- Break each dimension range into bins
- Break the screen into a grid using the number of bins for 2 dimensions
- Repeat the process for 2 more dimensions within the subimages formed by first grid, recurse through all dimensions
- Look for repeated patterns, outliers, trends, gaps
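The recursive embedding on this slide amounts to a mixed-radix index computation. The sketch below assumes that even-numbered dimensions (0, 2, 4, ...) are stacked along x and odd-numbered ones along y, with the first pair forming the outermost grid; the function names and that assignment are illustrative choices, not something the slide specifies.

```python
def bin_index(value, low, high, n_bins):
    """Equal-width binning of one value; the top edge falls in the last bin."""
    if high == low:
        return 0
    idx = int((value - low) / (high - low) * n_bins)
    return min(max(idx, 0), n_bins - 1)

def stacked_cell(bins, bin_counts):
    """Return the (x, y) cell of a record given its bin index per dimension.

    Hypothetical helper: each later dimension subdivides the cells formed
    by the earlier ones, which is what 'x * n + b' expresses.
    """
    x = y = 0
    for d, (b, n) in enumerate(zip(bins, bin_counts)):
        if d % 2 == 0:
            x = x * n + b
        else:
            y = y * n + b
    return x, y

# Four dimensions with 2, 2, 3, and 3 bins: a 2x2 outer grid of 3x3 subimages.
cell = stacked_cell((1, 0, 2, 1), (2, 2, 3, 3))
```

The total grid is the product of the bin counts along each screen axis, which is why the number of bins per dimension has to stay small.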

Pixel-Oriented Techniques
- Each dimension creates an image
- Each value controls color of a pixel
- Many organizations of pixels possible (raster, spiral, circle segment, space-filling curves)
- Reordering data can reveal interesting features, relations between dimensions
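A minimal sketch of the simplest case, the raster organization, is shown below. The function names, the 0-255 grayscale mapping, and the `order_by` parameter are assumptions for the example; spiral, circle-segment, and space-filling layouts would replace only the step that places values into rows.

```python
def gray_level(value, low, high):
    """Map a value to a 0-255 gray level by linear scaling."""
    if high == low:
        return 0
    return round(255 * (value - low) / (high - low))

def raster_image(values, width):
    """Lay one dimension's values out row by row, one pixel per value."""
    low, high = min(values), max(values)
    levels = [gray_level(v, low, high) for v in values]
    return [levels[i:i + width] for i in range(0, len(levels), width)]

def pixel_images(records, width, order_by=None):
    """Build one image per dimension, optionally sorting records first."""
    if order_by is not None:
        records = sorted(records, key=lambda rec: rec[order_by])
    n_dims = len(records[0])
    return [raster_image([rec[d] for rec in records], width)
            for d in range(n_dims)]

# Made-up sample data: four records, two dimensions.
records = [(3, 30), (1, 10), (2, 20), (4, 40)]
unordered = pixel_images(records, width=2)
ordered = pixel_images(records, width=2, order_by=0)
```

After sorting by dimension 0, both images show the same smooth gradient, which is how reordering exposes the relation between the two dimensions.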

Methods to Cope with Scale
- Many modern datasets contain large numbers of records (millions and billions) and/or dimensions (hundreds and thousands)
- Several strategies to handle scale problems
  - Sampling
  - Filtering
  - Clustering/aggregation
- Techniques can be automated or user-controlled
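The three strategies can each be sketched in a few lines. These are deliberately simple stand-ins under stated assumptions: uniform random sampling, a range filter on one dimension, and aggregation by fixed-width bins in place of a real clustering algorithm. All function names and the sample data are hypothetical.

```python
import random
from collections import defaultdict

def sample(records, k, seed=0):
    """Sampling: keep a uniform random subset of at most k records."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))

def filter_range(records, dim, low, high):
    """Filtering: keep records whose value on `dim` lies in [low, high]."""
    return [rec for rec in records if low <= rec[dim] <= high]

def aggregate(records, dim, bin_width):
    """Aggregation: replace each bin of records on `dim` by its mean record."""
    groups = defaultdict(list)
    for rec in records:
        groups[int(rec[dim] // bin_width)].append(rec)
    return {b: tuple(sum(col) / len(col) for col in zip(*members))
            for b, members in sorted(groups.items())}

# Made-up sample data: four records, two numeric dimensions.
records = [(1, 10), (2, 20), (11, 30), (12, 50)]
subset = sample(records, 2)
in_range = filter_range(records, dim=1, low=15, high=35)
means = aggregate(records, dim=0, bin_width=10)
```

Exposing `k`, the filter bounds, and `bin_width` as parameters is one way to make the same operations either automated or user-controlled.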

Examples of Data Clustering

Example of Dimension Clustering

Example of Data Sampling

The Visual Data Analysis (VDA) Process
- Overview
- Filter/cluster/sample
- Scan
- Select "interesting"
- Details on demand
- Link between different views
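The select, details-on-demand, and link steps are often realized by sharing a set of selected record indices among views. The sketch below illustrates that idea only; the function names and sample data are assumptions, and nothing here is taken from XmdvTool's code.

```python
def brush(records, dim, low, high):
    """Select: indices of records whose value on `dim` lies in [low, high]."""
    return {i for i, rec in enumerate(records) if low <= rec[dim] <= high}

def highlight_flags(records, selected):
    """Link: a per-record flag that every view can use when drawing."""
    return [i in selected for i in range(len(records))]

def details(records, selected, dim_names):
    """Details on demand: full attribute listing for the selected records."""
    return [dict(zip(dim_names, records[i])) for i in sorted(selected)]

# Made-up sample data: two numeric dimensions plus a label.
records = [(1.0, 200.0, "a"), (5.0, 150.0, "b"), (9.0, 120.0, "c")]
selected = brush(records, dim=0, low=4.0, high=10.0)
flags = highlight_flags(records, selected)
info = details(records, selected, ["x", "y", "label"])
```

Because the selection is stored as record indices rather than screen positions, a brush made in one view (say, a scatterplot) can be highlighted in any other view of the same records.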

Demonstration

Summary
- Visualization a powerful component of the data analysis process
- Each stage of analysis can be enhanced
- Visualization can help guide computational analysis, and vice versa
- Multiple linked views and a rich assortment of interactions key to success

For Further Info on XmdvTool
- Contains source code, Windows executable, data sets, documentation, copies of most Xmdv publications, case studies
- We gratefully acknowledge support for the development of XmdvTool from the National Science Foundation (IIS , IRIS , IIS , IIS , CCF , and IIS ) and the National Security Agency

Questions?