Graph Visualization: Extensions
Presented by Dave Fuhry and Yang Zhang
Outline
– Some Visualization Tools
– Why visualization? (Re-motivation)
– Challenges
– Information Visualization Data Types
– TreeMaps
– Handling high dimension: PCA, Co-Clustering, Parallel Coordinates, Grand Tour
– PRISM-HD: APSS plot, CSV
– Applications 1: Disaster (Geodesic, content)
– Applications 2: Social Network Analysis
Some Visualization Tools
Gephi, Prefuse, Gnuplot, GraphViz, matplotlib, NodeXL, Pajek, d3, Sigma.js, Cobweb, InfoViz, Cytoscape, Guess, NetworkX, Force-Directed Graph, Interactive GUI, Weka, Orange
Challenges
Same challenges as with graph layout:
Layout
– Representing items, their attributes, and structure.
Scale
– “Pixel wall”, but Big Data scales to billions of records.
– Shneiderman ’08: a billion records into a million pixels
Interaction
– Enable the user to explore and gain insight
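Set A, Set B, Set C, Set D: one X column and one Y column per set [Anscombe 73]
Summary Statistics (identical for all four sets): μ_X = 9.0, σ_X = 3.317, μ_Y = 7.5, σ_Y = 2.03
Linear Regression (identical for all four sets): Y = 3 + 0.5X, R² = 0.67
Slides courtesy: Jeffrey Heer, Stanford: A Brief Introduction to Data Visualization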
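The summary statistics on this slide can be checked directly. The sketch below recomputes them from the published Anscombe (1973) values; mapping the slide's Set A to D onto Anscombe's data sets I to IV is an assumption, and `summarize` is an illustrative helper name.

```python
from math import sqrt

# Anscombe's four data sets (1973). Sets A-C share the same X values.
# Assumption: the slide's Sets A-D correspond to Anscombe's sets I-IV.
X_ABC = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "A": (X_ABC, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "B": (X_ABC, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "C": (X_ABC, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "D": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
          [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def summarize(xs, ys):
    """Return means, sample standard deviations, least-squares line, and R^2."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "sd_x": sqrt(sxx / (n - 1)),
        "mean_y": mean_y,
        "sd_y": sqrt(syy / (n - 1)),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
        "r2": sxy ** 2 / (sxx * syy),
    }

for name, (xs, ys) in QUARTET.items():
    stats = summarize(xs, ys)
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

The four sets print nearly identical numbers, which is the point of the slide: the statistics alone do not reveal how different the sets look when plotted.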
[Figure: scatterplots of Y against X for Set A, Set B, Set C, and Set D]
Slides courtesy: Jeffrey Heer, Stanford: A Brief Introduction to Data Visualization
Slide courtesy: Ben Shneiderman, UMD: Information Visualization for Knowledge Discovery.
Wattenberg [Shneiderman ‘92]
Wattenberg 1998
– rectangle size: market cap (Q)
– rectangle position: market sector (N), market cap (Q)
– color hue: loss vs. gain (N, O)
– color value: magnitude of loss or gain (Q)
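To make "rectangle size encodes a quantity" concrete, here is a minimal sketch of a slice-and-dice treemap layout, the simple alternating-split scheme associated with early treemap work. It is not necessarily the layout behind the Wattenberg 1998 figure, and the tree, labels, and values are hypothetical.

```python
def total(node):
    """Sum of leaf values under a node; node is (label, value) or (label, [children])."""
    label, payload = node
    if isinstance(payload, list):
        return sum(total(child) for child in payload)
    return payload

def slice_and_dice(node, x, y, w, h, depth=0):
    """Return (label, x, y, w, h) for every leaf, with area proportional to value."""
    label, payload = node
    if not isinstance(payload, list):
        return [(label, x, y, w, h)]
    rects = []
    size = total(node)
    offset = 0.0  # fraction of the parent rectangle already used
    for child in payload:
        share = total(child) / size
        if depth % 2 == 0:
            # Even depth: place children side by side along the x axis.
            rects += slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1)
        else:
            # Odd depth: stack children along the y axis.
            rects += slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1)
        offset += share
    return rects

# Hypothetical two-sector market; leaf values stand in for market cap.
market = ("market", [
    ("tech", [("T1", 40), ("T2", 10)]),
    ("energy", [("E1", 30), ("E2", 20)]),
])

for rect in slice_and_dice(market, 0, 0, 100, 100):
    print(rect)
```

Each sector occupies a contiguous strip (position encodes the nominal sector) and each leaf's area is proportional to its value; hue and value would be assigned separately from the gain or loss figures.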
Dimensionality Reduction
– Multidimensional scaling, e.g. PCA
– Self-organizing map
Image credit: Matthias Scholz
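As a rough illustration of PCA as a dimensionality-reduction step, the sketch below finds the top principal directions by power iteration with deflation on the covariance matrix and projects the data onto them. This is a teaching sketch in plain Python, not how production libraries compute PCA; the function names and sample data are hypothetical, and it assumes the data has non-zero variance in at least `k` directions.

```python
import random
from math import sqrt

def center(rows):
    """Subtract the per-column mean so every dimension has mean zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def top_eigenpair(cov, iters=500, seed=0):
    """Power iteration: repeatedly apply cov and renormalize."""
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in cov]
    for _ in range(iters):
        w = matvec(cov, v)
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(cov, v)))
    return eigenvalue, v

def pca(rows, k=2):
    """Return the top-k principal directions and the k-dimensional projection."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(k):
        lam, v = top_eigenpair(cov)
        components.append(v)
        # Deflation: remove the direction just found so the next pass finds the next one.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    projected = [[sum(a * b for a, b in zip(row, c)) for c in components]
                 for row in centered]
    return components, projected

# Hypothetical 3-D data lying close to a line; the third dimension is constant.
rows = [[float(t), 2.0 * t + 0.5 * (-1) ** t, 3.0] for t in range(10)]
components, projected = pca(rows, k=2)
print(components[0])
```

The first direction comes out close to (1, 2, 0) normalized, up to sign, so a 2D scatterplot of `projected` keeps almost all of the variation in the original three columns.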
Parallel Coordinates
– Draw a vertical line for each dimension
– Each item is drawn as a line through the dimensions
Figures from Xiang, Fuhry, Jin, Zhao, Huang: Visualizing Clusters in Parallel Coordinates…, PAKDD ‘12
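The two drawing rules can be sketched as a function that turns each item into the vertices of its polyline: axis j sits at x = j, and the height on that axis is the item's value in dimension j. Rescaling every dimension to [0, 1] by its minimum and maximum is an assumption made here so that the axes share a height; the function name is illustrative.

```python
def parallel_coordinates(rows):
    """Map each row to polyline vertices [(axis_index, scaled_value), ...]."""
    d = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(d)]
    highs = [max(r[j] for r in rows) for j in range(d)]

    def scale(value, j):
        span = highs[j] - lows[j]
        # A constant dimension has no spread; place it mid-axis.
        return 0.5 if span == 0 else (value - lows[j]) / span

    return [[(j, scale(r[j], j)) for j in range(d)] for r in rows]

# Hypothetical items with three dimensions.
items = [[1, 10, 100], [2, 30, 100], [3, 20, 100]]
for line in parallel_coordinates(items):
    print(line)
```

Passing each vertex list to any line-drawing routine, one polyline per item, produces the plot; items that are similar across dimensions show up as bundles of nearby lines.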
Grand Tour
– Visualize high-dimensional data (HDD) with 2D scatterplots
– “Tour” randomly generated planes
– Smooth transitions between planes
[Asimov ‘83] [Buja, Cook, Asimov, Hurley ‘04]
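A minimal sketch of the idea, under simplifying assumptions: random planes are drawn as orthonormal pairs of vectors, and the "smooth transition" is approximated by linearly blending the two bases and re-orthonormalizing. Published grand tour methods interpolate between planes more carefully; every name below is hypothetical.

```python
import random
from math import sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormalize(u, v):
    """Gram-Schmidt: orthonormal pair spanning the same plane as (u, v)."""
    norm_u = sqrt(dot(u, u))
    e1 = [a / norm_u for a in u]
    along = dot(v, e1)
    w = [b - along * a for a, b in zip(e1, v)]
    norm_w = sqrt(dot(w, w))
    e2 = [b / norm_w for b in w]
    return e1, e2

def random_plane(dim, rng):
    """Random 2-D plane in dim-dimensional space, as an orthonormal basis."""
    u = [rng.gauss(0, 1) for _ in range(dim)]
    v = [rng.gauss(0, 1) for _ in range(dim)]
    return orthonormalize(u, v)

def blend(plane_a, plane_b, t):
    """Simplified transition: blend the bases linearly, then re-orthonormalize.
    Assumes the blended vectors do not vanish, which holds for generic planes."""
    u = [(1 - t) * a + t * b for a, b in zip(plane_a[0], plane_b[0])]
    v = [(1 - t) * a + t * b for a, b in zip(plane_a[1], plane_b[1])]
    return orthonormalize(u, v)

def project(rows, plane):
    """2-D scatterplot coordinates of every row in the given plane."""
    e1, e2 = plane
    return [(dot(r, e1), dot(r, e2)) for r in rows]

def tour(rows, n_planes=3, steps=10, seed=0):
    """Yield one 2-D projection per animation frame."""
    rng = random.Random(seed)
    dim = len(rows[0])
    current = random_plane(dim, rng)
    for _ in range(n_planes):
        target = random_plane(dim, rng)
        for s in range(1, steps + 1):
            yield project(rows, blend(current, target, s / steps))
        current = target
```

Drawing each yielded frame as a scatterplot in sequence gives the animated tour; small `steps` values make the motion jumpy, larger ones make it smoother.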
Grand Tour (Demo) Projection of a grand tour of six-dimensional data. Source: GGobi software.
Social networks, Protein Interactions, Internet, VLSI networks, Data dependencies, Neighborhood graphs
PRISM-HD
What?
– A novel mechanism for exploring complex data
Why?
– The user is often overwhelmed by the characteristics of the data
– Befuddled on where to start
How?
– Given a similarity measure of interest
– Compute the similarity graph at threshold (t)
Key: Graphs are dimensionless
– Provide the user with graph visualization cues
The user determines the next threshold and repeats
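The "How?" loop can be sketched as follows. This is not the PRISM-HD implementation: cosine similarity stands in for the similarity measure of interest, and edge and connected-component counts stand in for the visualization cues. All names and data are hypothetical.

```python
from itertools import combinations
from math import sqrt

def cosine(u, v):
    """Cosine similarity; an assumed stand-in for the measure of interest."""
    num = sum(a * b for a, b in zip(u, v))
    den = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def similarity_graph(rows, t, sim=cosine):
    """Edges (i, j) for every pair of items whose similarity is at least t."""
    return [(i, j) for i, j in combinations(range(len(rows)), 2)
            if sim(rows[i], rows[j]) >= t]

def count_components(n, edges):
    """Number of connected components, via a small union-find."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(a) for a in range(n)})

# Hypothetical items: two tight pairs pointing in different directions.
rows = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
for t in (0.999, 0.9, 0.2):
    edges = similarity_graph(rows, t)
    print(t, len(edges), count_components(len(rows), edges))
```

Lowering the threshold adds edges and merges components, so scanning a few thresholds gives a user a first sense of where the structure in the data lies, whatever its original dimensionality.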
[Figure panels: High threshold, Moderate threshold, Low threshold]
Applications 1: Disaster Mgmt / Geodesic Overlays
Applications 2: Disaster Mgmt / Community Analysis [Fuhry, Ruan, and Parthasarathy. WebSci’12]
Applications 3: Social Network Analysis
Applications 3: Social Network Analysis (2)
Appendix
Nominal, Ordinal and Quantitative
N - Nominal (labels)
– Fruits: Apples, oranges, …
O - Ordinal (rank-ordered)
– Quality of meat: Grade A, AA, AAA
Q - Interval (location of zero arbitrary)
– Dates: Jan 19, 2006; Location: (LAT 33.98, LONG )
– Like a geometric point. Cannot compare directly
– Only differences (i.e. intervals) may be compared
Q - Ratio (zero fixed)
– Physical measurement: Length, Mass, Temp, …
– Counts and amounts
– Like a geometric vector, origin is meaningful
S. S. Stevens, On the Theory of Scales of Measurement, 1946
Slide courtesy: Jeffrey Heer, Stanford: A Brief Introduction to Data Visualization
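A small sketch of what each scale permits, using Python's own types as a stand-in. The second date, the lengths, and the helper name are hypothetical, and the grade ordering (A lowest, AAA highest) is an assumption the slide does not state.

```python
from datetime import date

# Nominal: only equality is meaningful.
same_fruit = "apple" == "orange"

# Ordinal: order is meaningful, distances between ranks are not.
# Assumption: grades are listed from lowest to highest.
GRADES = ["A", "AA", "AAA"]

def outranks(g1, g2):
    return GRADES.index(g1) > GRADES.index(g2)

# Interval: differences are meaningful, ratios are not.
d1 = date(2006, 1, 19)
d2 = date(2006, 2, 19)
gap = d2 - d1  # a timedelta; "d2 / d1" has no meaning
try:
    d2 / d1
    ratio_allowed = True
except TypeError:
    # Python's date type refuses the division, matching the scale's rules.
    ratio_allowed = False

# Ratio: zero is fixed, so ratios are meaningful.
length_a, length_b = 4.0, 2.0
ratio = length_a / length_b

print(same_fruit, outranks("AAA", "A"), gap.days, ratio_allowed, ratio)
```

The scale of a field limits which visual encodings make sense for it, which is why the encoding lists tag each variable with N, O, or Q.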
[Figure: counts arranged along three dimensions: Age, Marital Status (Single, Married, Divorced, Widowed), and Year. Sum along Marital Status gives All Marital Status; sum along Age gives All Ages; sum along Year gives All Years]
Slide courtesy: Jeffrey Heer, Stanford: A Brief Introduction to Data Visualization
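The "sum along" operations on this slide can be sketched as a roll-up over a table of counts keyed by (age group, marital status, year). The cell values and the `roll_up` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical cells: (age_group, marital_status, year) -> count
CUBE = {
    ("20-29", "Single", 1990): 5,
    ("20-29", "Married", 1990): 3,
    ("30-39", "Single", 1990): 2,
    ("30-39", "Married", 2000): 4,
    ("20-29", "Single", 2000): 6,
}

def roll_up(cube, axis):
    """Sum along one dimension: drop position `axis` from every key and add up counts."""
    out = defaultdict(int)
    for key, count in cube.items():
        reduced = key[:axis] + key[axis + 1:]
        out[reduced] += count
    return dict(out)

all_marital = roll_up(CUBE, 1)       # keys are now (age_group, year)
by_year = roll_up(all_marital, 0)    # keys are now (year,)
grand_total = roll_up(by_year, 0)    # single key ()

print(all_marital)
print(by_year)
print(grand_total)
```

Each call removes one dimension, so applying it three times collapses the table to a single grand total.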
Visual encoding variables: Position (x 2), Size, Value, Texture, Color, Orientation, Shape
Visual encoding variables: Position, Length, Area, Volume, Value, Texture, Color, Orientation, Shape, Transparency, Blur / Focus, …
Image courtesy: “Jer” of blprnt.com. “Just Landed”