Marti Hearst SIMS 247 SIMS 247 Lecture 4 Graphing Multivariate Information January 29, 1998.


1 Marti Hearst SIMS 247 SIMS 247 Lecture 4 Graphing Multivariate Information January 29, 1998

2 Marti Hearst SIMS 247 Follow-up to previous lecture
Docuverse:
–length of arc is proportional to number of subdirectories
–radius for a given arc is long enough to contain marks for all the files in the directory
Nightingale’s “coxcomb”:
–keep arc length constant
–vary radius length (proportional to sqrt(freq))
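The square-root rule for the coxcomb can be checked with a small sketch. With a constant arc angle theta, a wedge of radius r has area (theta / 2) * r^2, so a radius proportional to sqrt(freq) makes the wedge area proportional to freq. The function names and the sample counts below are illustrative assumptions, not taken from the lecture.

```python
import math

def coxcomb_radii(freqs, max_radius=1.0):
    """Return one radius per wedge, proportional to sqrt(frequency).

    The largest frequency is drawn at max_radius (assumes max(freqs) > 0).
    """
    largest = max(freqs)
    return [max_radius * math.sqrt(f / largest) for f in freqs]

def wedge_area(radius, n_wedges):
    """Area of one wedge when the circle is split into n_wedges equal arcs."""
    theta = 2 * math.pi / n_wedges
    return 0.5 * theta * radius ** 2

counts = [100, 25, 400, 0]          # made-up frequencies
radii = coxcomb_radii(counts)
areas = [wedge_area(r, len(counts)) for r in radii]
```

With these counts, the wedge for 100 gets half the radius of the wedge for 400, and therefore one quarter of its area, which matches the 100:400 ratio in the data.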

3 Marti Hearst SIMS 247 Today: Multivariate Information
We see a 3D world
How do we handle more than 3 variables?
–multi-functioning elements
  Tufte examples
  cinematography example
–multiple views

4 Marti Hearst SIMS 247 Example Data Sets
How do we handle 9 variables?
–Our web access dataset
–Factors involved in alcoholism
ALCOHOL
–USE
–AVAILABILITY
–CONCERN ABOUT USE
–COPING MECHANISMS
PERSONALITY MEASURES
–EXTROVERSION
–DISINHIBITION
OTHER
–GENDER
–GPA

5 Marti Hearst SIMS 247 Graphing Multivariate Information
How do we handle cases with more than three variables?
–Scatterplot matrices
–Parallel coordinates
–Multiple views
–Overlay space and time
–Interaction/animation across time

6 Marti Hearst SIMS 247 Multiple Variables: Scatterplot Matrices (from Wegman et al.)

7 Marti Hearst SIMS 247 Multiple Variables: Scatterplot Matrices (from Schall 95)
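As a rough sketch of how a scatterplot matrix is laid out: k variables give a k-by-k grid, and the panel in row i, column j plots variable j on the x axis against variable i on the y axis. The code below only builds the point lists per panel (no drawing); the function name and the sample values are hypothetical.

```python
from itertools import product

def scatterplot_matrix_panels(data):
    """Build the off-diagonal panels of a scatterplot matrix.

    data maps a variable name to a list of values (all lists the same length).
    Returns a dict mapping (row_var, col_var) to a list of (x, y) points,
    where x comes from col_var and y from row_var.
    """
    names = list(data)
    panels = {}
    for row_var, col_var in product(names, repeat=2):
        if row_var == col_var:
            continue  # the diagonal usually holds a label or histogram instead
        panels[(row_var, col_var)] = list(zip(data[col_var], data[row_var]))
    return panels

# Made-up sample with three of the variables named on the example-data slide.
data = {
    "use": [1, 2, 3],
    "gpa": [3.5, 3.0, 2.5],
    "extroversion": [10, 20, 30],
}
panels = scatterplot_matrix_panels(data)
```

Three variables yield 6 off-diagonal panels; 9 variables would yield 72, which is one reason the matrix becomes hard to read as the number of variables grows.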

8 Marti Hearst SIMS 247 Multiple Views: Star Plot (Discussed in Fienberg 79. Works better with animation. Example taken from Behrens & Yu 95.)

9 Marti Hearst SIMS 247 Multiple Dimensions: Parallel Coordinates (earthquake data, color indicates longitude, y axis severity of earthquake, from Schall 95)
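A minimal sketch of the parallel-coordinates construction, assuming the common convention of rescaling each variable to [0, 1] on its own vertical axis: every record becomes a polyline with one vertex per axis. The variable names and earthquake-like values are invented for illustration and are not the data in the figure.

```python
def parallel_coordinates(records, variables):
    """Turn each record into a polyline of (axis_index, scaled_value) vertices.

    Each variable is min-max scaled independently, so every axis spans 0..1.
    A variable with no spread is placed at the middle of its axis.
    """
    lows = {v: min(r[v] for r in records) for v in variables}
    highs = {v: max(r[v] for r in records) for v in variables}
    polylines = []
    for record in records:
        line = []
        for axis, v in enumerate(variables):
            span = highs[v] - lows[v]
            y = (record[v] - lows[v]) / span if span else 0.5
            line.append((axis, y))
        polylines.append(line)
    return polylines

# Made-up records for illustration only.
quakes = [
    {"severity": 4.0, "depth": 10.0, "longitude": -120.0},
    {"severity": 6.0, "depth": 30.0, "longitude": -118.0},
    {"severity": 5.0, "depth": 20.0, "longitude": -119.0},
]
lines = parallel_coordinates(quakes, ["severity", "depth", "longitude"])
```

Because every axis is scaled separately, the order of the axes and the choice of scaling both affect which patterns are visible; this sketch fixes the order to the list passed in.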

10 Marti Hearst SIMS 247 Multiple Dimensions: Multivariate Star Plot (from Behrens & Yu 95)
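The geometry of a star plot can be sketched as follows, assuming the usual construction: each variable gets a spoke at an equal angular spacing, and the record's value sets how far along the spoke its vertex sits. Joining the vertices gives the star polygon for that record. The function name and inputs are illustrative.

```python
import math

def star_vertices(values, max_values):
    """Return the (x, y) polygon vertices of one record's star.

    values and max_values are parallel lists; each value is assumed to be
    non-negative and each maximum positive, so spoke lengths fall in 0..1.
    """
    n = len(values)
    vertices = []
    for i, (value, vmax) in enumerate(zip(values, max_values)):
        angle = 2 * math.pi * i / n      # spokes evenly spaced around the circle
        length = value / vmax            # fraction of the spoke that is used
        vertices.append((length * math.cos(angle), length * math.sin(angle)))
    return vertices

# One made-up record with four variables, each with maximum 4.
pts = star_vertices([2, 3, 4, 1], [4, 4, 4, 4])
```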

11 Marti Hearst SIMS 247 Chernoff Faces
Assumption: people have built-in face recognizers
Map variables to features of a cartoon face
–Example: eyes location, separation, angle, shape, width
–Example: entire face area, shape, nose length, mouth location, smile curve
Originally tongue-in-cheek, but taken seriously
Sometimes seems to work for small numbers of points
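The variable-to-feature mapping might be sketched like this. It is only an illustration of the idea: the feature names, their numeric ranges, and the assumption that every variable has already been scaled to [0, 1] are all hypothetical, and the sketch computes drawing parameters without rendering a face.

```python
# Hypothetical drawing ranges for a few facial features (not from Chernoff 1973).
FEATURE_RANGES = {
    "face_width": (0.6, 1.0),
    "eye_separation": (0.2, 0.5),
    "nose_length": (0.1, 0.4),
    "smile_curve": (-1.0, 1.0),   # negative = frown, positive = smile
}

def chernoff_parameters(point, assignment):
    """Map one data point to facial-feature parameters.

    point maps variable name -> value already scaled to 0..1.
    assignment maps feature name -> the variable that drives it.
    Each feature is linearly interpolated within its drawing range.
    """
    face = {}
    for feature, variable in assignment.items():
        low, high = FEATURE_RANGES[feature]
        face[feature] = low + point[variable] * (high - low)
    return face

point = {"v1": 0.0, "v2": 1.0, "v3": 0.5, "v4": 0.5}   # made-up values
assignment = {
    "face_width": "v1",
    "eye_separation": "v2",
    "nose_length": "v3",
    "smile_curve": "v4",
}
face = chernoff_parameters(point, assignment)
```

Keeping the assignment separate from the data makes it easy to try different mappings, for example driving only the eye features or only the mouth features with the same variables.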

12 Marti Hearst SIMS 247 Chernoff Example (Marchette)
Three groups of points
–each drawn from a different distribution with 5 variables
First show scatter-plot matrix
Then graph with Chernoff faces
–vary faces overall
–vary eyes
–vary mouth and eyebrows
Which seems to be most effective?

13 Marti Hearst SIMS 247 Chernoff Experiment (Marchette)

14 Marti Hearst SIMS 247 Chernoff Experiment (Marchette)

15 Marti Hearst SIMS 247 Chernoff Experiment (Marchette)

16 Marti Hearst SIMS 247 Chernoff Experiment (Marchette)

17 Marti Hearst SIMS 247 Overlaying Space and Time (Minard’s graph of Napoleon’s march through Russia)

18 Marti Hearst SIMS 247 A Detective Story (Inselberg 97)
Domain: Manufacture of computer chips
Objectives: create batches with
–high yield (X1)
–high quality (X2)
Hypothesized cause of problem:
–10 types of defects (X3-X12)
Some physical properties (X13-X16)
Approach:
–examine data for 473 batches
–use interactive parallel coordinates

19 Marti Hearst SIMS 247 Multidimensional Detective
Long term objectives:
–high quality, high yield
Logical approach given the hypothesis:
–try to eliminate defects
First clue:
–what patterns can be found among batches with high yield and quality?

20 Marti Hearst SIMS 247 Detectives aren’t intimidated! X1 seems to be normally distributed; X2 bipolar

21 Marti Hearst SIMS 247 High quality yields obtained despite defects
Figure annotations: good batches; some low X3 defect batches don’t appear here; X15 breaks into two clusters (important physical property); at least one good batch with defects

22 Marti Hearst SIMS 247 Low-defect batches are not highest quality!
Figure annotations: few defects; low yield, low quality

23 Marti Hearst SIMS 247 Original plot shows defect X6 behaves differently; exclude it from the 9-out-of-10 defects constraint; the best batches return
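The kind of query behind this step can be sketched as a filter over batches. This assumes the constraint means "zero defects in at least k of the defect categories considered"; the slide does not spell out the thresholds, so `min_zero`, the helper names, and the three sample batches are all hypothetical.

```python
DEFECT_VARS = [f"X{i}" for i in range(3, 13)]   # X3 .. X12, ten categories

def make_batch(name, **defects):
    """Build a batch record with zero defects except those given (illustrative)."""
    batch = {v: 0 for v in DEFECT_VARS}
    batch.update(defects)
    batch["name"] = name
    return batch

def low_defect_batches(batches, min_zero, exclude=()):
    """Select batches with zero defects in at least min_zero categories.

    Categories listed in exclude are ignored when counting.
    """
    considered = [v for v in DEFECT_VARS if v not in exclude]
    selected = []
    for batch in batches:
        zero_count = sum(1 for v in considered if batch[v] == 0)
        if zero_count >= min_zero:
            selected.append(batch)
    return selected

# Made-up batches, not the 473 batches from the study.
batches = [
    make_batch("A", X6=5),
    make_batch("B", X3=2, X6=4),
    make_batch("C", X3=1, X4=1, X5=1),
]
strict = [b["name"] for b in low_defect_batches(batches, min_zero=9)]
relaxed = [b["name"] for b in low_defect_batches(batches, min_zero=8, exclude=("X6",))]
```

In this toy data, dropping X6 from the count lets batch B back into the selection, which mirrors how excluding one defect type can bring previously filtered batches back into view.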

24 Marti Hearst SIMS 247 Isolate the best batches.
Conclusion: defects are necessary!
The very best batch has X3 and X6 defects.
Ensure this is not an outlier: look at top few batches. The same result is found.

25 Marti Hearst SIMS 247 How to graph web page traversals?

26 Marti Hearst SIMS 247 References for this Lecture
Behrens, John and Yu, Chong Ho. Visualization Techniques of Different Dimensions, 1995. http://seamonkey.ed.asu.edu/~behrens/asu/reports/compre/comp1.html
Fienberg, S. E. Graphical methods in statistics. The American Statistician, 33, 165-178, 1979.
Friendly, Michael. Gallery of Data Visualization. http://www.math.yorku.ca/SCS/Gallery
–scan of Minard’s graph from Tufte 1983
–multivariate means comparison
Wegman, Edward J. and Luo, Qiang. High Dimensional Clustering Using Parallel Coordinates and the Grand Tour. Conference of the German Classification Society, Freiberg, Germany, 1996. http://galaxy.gmu.edu/papers/inter96.html
Cook, R. Dennis and Weisberg, Sanford. An Introduction to Regression Graphics, 1995. http://stat.umn.edu/~rcode/node3.html
Schall, Matthew. SPSS DIAMOND: a visual exploratory data analysis tool. Perspective, 18 (2), 1995. http://www.spss.com/cool/papers/diamondw.html
Marchette, David. An Investigation of Chernoff Faces for High Dimensional Data Exploration. http://farside.nswc.navy.mil/CSI803/Dave/chern.html
Chernoff, H. The use of Faces to Represent Points in k-Dimensional Space Graphically. Journal of the American Statistical Association, 68, 361-368, 1973.

27 Marti Hearst SIMS 247 Next Time: Brushing and Linking
An interactive technique
Brushing:
–pick out some points from one viewpoint
–see how this affects other viewpoints
–(Cleveland scatterplot matrix example)
Graphs must be linked together
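A minimal sketch of the linking idea, assuming the simplest possible design: all views share one set of selected record indices, so a rectangular brush in one scatterplot determines which points are highlighted in every other view. The class name, method names, and sample records are hypothetical and do not describe any of the systems named in this lecture.

```python
class LinkedViews:
    """Several scatterplot views over the same records, sharing one selection."""

    def __init__(self, records):
        self.records = records
        self.selected = set()     # indices of brushed records, shared by all views

    def brush(self, x_var, y_var, x_range, y_range):
        """Select the records that fall inside a rectangle in one view."""
        (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
        self.selected = {
            i for i, r in enumerate(self.records)
            if x_lo <= r[x_var] <= x_hi and y_lo <= r[y_var] <= y_hi
        }
        return self.selected

    def highlighted(self, x_var, y_var):
        """Points to highlight in another view, given the current brush."""
        return [
            (r[x_var], r[y_var])
            for i, r in enumerate(self.records)
            if i in self.selected
        ]

# Made-up records for illustration.
records = [
    {"use": 1, "gpa": 3.8, "extroversion": 12},
    {"use": 5, "gpa": 2.9, "extroversion": 30},
    {"use": 6, "gpa": 2.5, "extroversion": 28},
    {"use": 2, "gpa": 3.4, "extroversion": 15},
]
views = LinkedViews(records)
brushed = views.brush("use", "gpa", (4, 7), (2.0, 3.0))
linked = views.highlighted("use", "extroversion")
```

Storing record indices rather than coordinates is what makes the views "linked": the selection is independent of which pair of variables any one view happens to display.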

28 Marti Hearst SIMS 247 Brushing and Linking Systems
VISAGE: Roth et al.
Attribute Explorer: Tweedie et al.
SpotFire (IVEE): Ahlberg et al.

