Marti Hearst, SIMS 247, Lecture 4: Graphing Multivariate Information. January 29, 1998.

Follow-up to the previous lecture
Docuverse:
  – length of arc is proportional to number of subdirectories
  – radius for a given arc is long enough to contain marks for all the files in the directory
Nightingale’s “coxcomb”:
  – keep arc length constant
  – vary radius length (proportional to sqrt(freq))
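A minimal sketch of the coxcomb rule, assuming equal arc angles for every wedge. The function names and the sample counts are hypothetical. It shows why the radius is taken proportional to sqrt(freq): the wedge area is then proportional to the frequency itself.

```python
import math


def coxcomb_radii(freqs, scale=1.0):
    """Return one wedge radius per frequency, proportional to sqrt(freq)."""
    return [scale * math.sqrt(f) for f in freqs]


def wedge_area(radius, n_wedges):
    """Area of a circular wedge whose arc angle is 2*pi / n_wedges."""
    angle = 2 * math.pi / n_wedges
    return 0.5 * angle * radius ** 2


freqs = [4, 9, 16]  # hypothetical counts, for illustration only
radii = coxcomb_radii(freqs)
areas = [wedge_area(r, len(freqs)) for r in radii]

# Area ratios match frequency ratios because area grows with radius squared.
ratios = [a / areas[0] for a in areas]
```

Scaling the radius by freq directly would instead make the area grow with freq squared and exaggerate large values.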

Today: Multivariate Information
We see a 3D world
How do we handle more than 3 variables?
  – multi-functioning elements
      Tufte examples
      cinematography example
  – multiple views

Example Data Sets
How do we handle 9 variables?
  – Our web access dataset
  – Factors involved in alcoholism
      ALCOHOL
        – USE
        – AVAILABILITY
        – CONCERN ABOUT USE
        – COPING MECHANISMS
      PERSONALITY MEASURES
        – EXTROVERSION
        – DISINHIBITION
      OTHER
        – GENDER
        – GPA

Graphing Multivariate Information
How do we handle cases with more than three variables?
  – Scatterplot matrices
  – Parallel coordinates
  – Multiple views
  – Overlay space and time
  – Interaction/animation across time

Multiple Variables: Scatterplot Matrices (from Wegman et al.)

Multiple Variables: Scatterplot Matrices (from Schall 95)
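As a rough sketch of how a scatterplot matrix is laid out: every ordered pair of distinct variables gets its own panel in a k-by-k grid. The function name is hypothetical, and the variable names are borrowed from the alcoholism example only for illustration.

```python
def scatterplot_matrix_panels(variables):
    """Map each off-diagonal grid cell (row, col) to its (x_variable, y_variable) pair."""
    panels = {}
    for row, y_var in enumerate(variables):
        for col, x_var in enumerate(variables):
            if row != col:  # the diagonal usually holds labels, not a scatterplot
                panels[(row, col)] = (x_var, y_var)
    return panels


panels = scatterplot_matrix_panels(["USE", "AVAILABILITY", "GPA"])
n_panels = len(panels)  # k * (k - 1) scatterplots for k variables
```

The panel count grows quadratically with the number of variables, which is one reason the technique gets crowded well before 9 variables.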

Multiple Views: Star Plot (Discussed in Fienberg 79. Works better with animation. Example taken from Behrens & Yu 95.)

Multiple Dimensions: Parallel Coordinates (earthquake data; color indicates longitude, the y axis shows earthquake severity; from Schall 95)
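A minimal sketch of the geometry behind a parallel coordinates plot, not the method of any particular tool: each variable gets its own vertical axis, each value is rescaled to that axis, and each observation becomes a polyline across the axes. The function name and the sample rows are hypothetical.

```python
def to_polylines(rows):
    """Scale each column to [0, 1] and return one polyline per row.

    Each polyline is a list of (axis_index, scaled_value) vertices.
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    polylines = []
    for row in rows:
        line = []
        for axis, value in enumerate(row):
            span = highs[axis] - lows[axis]
            # A constant column has no spread; place it mid-axis.
            scaled = (value - lows[axis]) / span if span else 0.5
            line.append((axis, scaled))
        polylines.append(line)
    return polylines


# Hypothetical observations with three variables each.
rows = [(1.0, 10.0, 5.0), (2.0, 30.0, 5.0), (3.0, 20.0, 5.0)]
lines = to_polylines(rows)
```

Drawing each polyline with a color taken from one more variable gives the kind of encoding described on the slide.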

Multiple Dimensions: Multivariate Star Plot (from Behrens & Yu 95)
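One way to compute the outline of a star plot, sketched under the assumption that each variable is scaled by a known maximum. The function name and the sample values are hypothetical. Each variable gets an evenly spaced spoke, and the value sets how far along the spoke the vertex sits.

```python
import math


def star_vertices(values, max_values):
    """Return (x, y) vertices of a star glyph, one spoke per variable."""
    k = len(values)
    vertices = []
    for i, (value, vmax) in enumerate(zip(values, max_values)):
        angle = 2 * math.pi * i / k  # spokes are evenly spaced around the circle
        radius = value / vmax        # spoke length scaled to [0, 1]
        vertices.append((radius * math.cos(angle), radius * math.sin(angle)))
    return vertices


# One hypothetical observation with four variables.
star = star_vertices([5, 2, 4, 1], [10, 4, 4, 2])
```

Connecting the vertices in order gives the polygon for one observation; drawing one polygon per observation gives the multiple-view form.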

Chernoff Faces
Assumption: people have built-in face recognizers
Map variables to features of a cartoon face
  – Example: eyes location, separation, angle, shape, width
  – Example: entire face area, shape, nose length, mouth location, smile curve
Originally tongue-in-cheek, but taken seriously
Sometimes seems to work for small numbers of points
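The variable-to-feature mapping can be sketched as a linear rescaling. This is an illustration only, not Chernoff's original parameterization: the feature ranges, the variable names x1 to x3, and the function name are all assumptions.

```python
# Hypothetical drawing ranges for three facial features.
FEATURE_RANGES = {
    "eye_separation": (0.2, 0.6),
    "nose_length": (0.1, 0.4),
    "smile_curve": (-1.0, 1.0),
}


def face_parameters(observation, data_ranges, feature_for_variable):
    """Linearly map each variable's value into the range of its assigned feature."""
    params = {}
    for var, value in observation.items():
        lo, hi = data_ranges[var]
        t = (value - lo) / (hi - lo)  # position of the value within its data range
        feature = feature_for_variable[var]
        f_lo, f_hi = FEATURE_RANGES[feature]
        params[feature] = f_lo + t * (f_hi - f_lo)
    return params


face = face_parameters(
    {"x1": 5.0, "x2": 0.0, "x3": 10.0},
    {"x1": (0.0, 10.0), "x2": (0.0, 10.0), "x3": (0.0, 10.0)},
    {"x1": "eye_separation", "x2": "nose_length", "x3": "smile_curve"},
)
```

Which variable is assigned to which feature changes what viewers notice, so the assignment is itself a design choice.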

Chernoff Example (Marchette)
Three groups of points
  – each drawn from a different distribution with 5 variables
First show scatter-plot matrix
Then graph with Chernoff faces
  – vary faces overall
  – vary eyes
  – vary mouth and eyebrows
Which seems to be most effective?

Chernoff Experiment (Marchette) (figures, four slides)

Overlaying Space and Time (Minard’s graph of Napoleon’s march through Russia)

A Detective Story (Inselberg 97)
Domain: manufacture of computer chips
Objectives: create batches with
  – high yield (X1)
  – high quality (X2)
Hypothesized cause of problem:
  – 10 types of defects (X3-X12)
Some physical properties (X13-X16)
Approach:
  – examine data for 473 batches
  – use interactive parallel coordinates

Multidimensional Detective
Long term objectives:
  – high quality, high yield
Logical approach given the hypothesis:
  – try to eliminate defects
First clue:
  – what patterns can be found among batches with high yield and quality?
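The first clue amounts to a range query on two axes: keep only batches whose X1 and X2 are both high, then look at how those batches behave on the other axes. A minimal sketch follows. The batch values, the thresholds, and the function name are invented for illustration and are not Inselberg's data.

```python
def select_batches(batches, ranges):
    """Keep the batches whose value on every constrained axis lies inside its range."""
    return [
        batch for batch in batches
        if all(lo <= batch[axis] <= hi for axis, (lo, hi) in ranges.items())
    ]


# Hypothetical batches: X1 = yield, X2 = quality, X3 = one defect type.
batches = [
    {"id": 1, "X1": 0.9, "X2": 0.8, "X3": 0.3},
    {"id": 2, "X1": 0.4, "X2": 0.9, "X3": 0.0},
    {"id": 3, "X1": 0.95, "X2": 0.85, "X3": 0.0},
    {"id": 4, "X1": 0.2, "X2": 0.1, "X3": 0.0},
]

# Select high-yield, high-quality batches; thresholds are assumptions.
good = select_batches(batches, {"X1": (0.8, 1.0), "X2": (0.7, 1.0)})
good_ids = [batch["id"] for batch in good]
```

In an interactive parallel coordinates tool the same query is made by dragging a range on the X1 and X2 axes, and the surviving polylines are redrawn across all 16 axes.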

Detectives aren’t intimidated!
X1 seems to be normally distributed; X2 bipolar

High quality yields obtained despite defects
Figure annotations:
  – good batches
  – some low X3 defect batches don’t appear here
  – X15 breaks into two clusters (important physical property)
  – at least one good batch with defects

Low-defect batches are not highest quality!
Figure annotations:
  – few defects
  – low yield, low quality

Original plot shows defect X6 behaves differently; exclude it from the 9-out-of-10 defects constraint; the best batches return

Isolate the best batches. Conclusion: defects are necessary!
The very best batch has X3 and X6 defects.
Ensure this is not an outlier: look at the top few batches. The same result is found.

How to graph web page traversals?

References for this Lecture
Behrens, John and Yu, Chong Ho. Visualization Techniques of Different Dimensions.
Fienberg, S. E. Graphical methods in statistics. The American Statistician, 33, 1979.
Friendly, Michael. Gallery of Data Visualization.
  – scan of Minard’s graph from Tufte 1983
  – multivariate means comparison
Wegman, Edward J. and Luo, Qiang. High Dimensional Clustering Using Parallel Coordinates and the Grand Tour. Conference of the German Classification Society, Freiberg, Germany.
Cook, R. Dennis and Weisberg, Sanford. An Introduction to Regression Graphics.
Schall, Matthew. SPSS DIAMOND: a visual exploratory data analysis tool. Perspective, 18 (2).
Marchette, David. An Investigation of Chernoff Faces for High Dimensional Data Exploration.
Chernoff, H. The use of Faces to Represent Points in k-Dimensional Space Graphically. Journal of the American Statistical Association, 68, 1973.

Next Time: Brushing and Linking
An interactive technique
Brushing:
  – pick out some points from one viewpoint
  – see how this affects other viewpoints
  – (Cleveland scatterplot matrix example)
Graphs must be linked together
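A rough sketch of the idea, not the design of any particular system: the views share one selection of record indices, so brushing a region in one view determines which points are highlighted in every other view. The class name, the field names, and the records are hypothetical.

```python
class LinkedViews:
    """Several 2-D views over the same records, sharing one brushed selection."""

    def __init__(self, records, views):
        self.records = records
        self.views = views      # view name -> (x_field, y_field)
        self.selected = set()   # indices of brushed records, shared by all views

    def brush(self, view_name, x_range, y_range):
        """Select the records that fall inside a rectangle drawn in one view."""
        x_field, y_field = self.views[view_name]
        self.selected = {
            i for i, rec in enumerate(self.records)
            if x_range[0] <= rec[x_field] <= x_range[1]
            and y_range[0] <= rec[y_field] <= y_range[1]
        }
        return self.selected

    def highlighted_points(self, view_name):
        """Return the brushed records as (x, y) points in another view."""
        x_field, y_field = self.views[view_name]
        return [
            (self.records[i][x_field], self.records[i][y_field])
            for i in sorted(self.selected)
        ]


records = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 5, "b": 6, "c": 7},
    {"a": 2, "b": 1, "c": 9},
]
linked = LinkedViews(records, {"ab": ("a", "b"), "bc": ("b", "c")})
brushed = linked.brush("ab", (0, 3), (0, 3))
in_bc = linked.highlighted_points("bc")
```

Keeping the selection as record indices rather than screen coordinates is what lets views with different axes stay in step.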

Brushing and Linking Systems
VISAGE: Roth et al.
Attribute Explorer: Tweedie et al.
SpotFire (IVEE): Ahlberg et al.